Improved proteases and methods for producing them

FIELD OF INVENTION 

A number of microbially derived related proteases are notably difficult to produce in
industrially relevant yields; they may be prone to various types of degradation and/or
instabilities. The present invention provides methods for producing such proteases by
expressing them with C-terminal amino acid extensions and/or modifications of an existing
C-terminus. The invention further provides the resulting proteases comprising such amino
acid extensions.

The present invention relates to isolated polypeptides having protease activity
related to a Nocardiopsis sp. protease and isolated nucleic acid sequences encoding such
proteases. The invention furthermore relates to nucleic acid constructs, vectors, and host
cells comprising these nucleic acid sequences as well as methods for producing and using the
proteases, in particular within animal feed.



BACKGROUND 

Polypeptides having protease activity, or proteases, are sometimes also designated
peptidases, proteinases, peptide hydrolases, or proteolytic enzymes. Proteases may be of the
exo-type that hydrolyses peptides starting at either end thereof, or of the endo-type that
acts internally in polypeptide chains (endopeptidases). Endopeptidases show activity on N-
and C-terminally blocked peptide substrates that are relevant for the specificity of the
protease in question.

The term "protease" is defined herein as an enzyme that hydrolyses peptide bonds. It
includes any enzyme belonging to the EC 3.4 enzyme group (including each of the thirteen
subclasses thereof). The EC number refers to Enzyme Nomenclature 1992 from NC-IUBMB,
Academic Press, San Diego, California, including supplements 1-5 published in Eur. J.
Biochem. 1994, 223, 1-5; Eur. J. Biochem. 1995, 232, 1-6; Eur. J. Biochem. 1996, 237, 1-5;
Eur. J. Biochem. 1997, 250, 1-6; and Eur. J. Biochem. 1999, 264, 610-650; respectively. The
nomenclature is regularly supplemented and updated; see e.g. the World Wide Web (WWW)
at htlp:/Ai^.chero^

US patent publication No. 2002/0182S72A1 discloses that if one or two of the last
two amino acids at the C-terminus of a polypeptide is/are charged polar, i.e. D or E
(negatively charged) or K, R, or H (positively charged), the tail would be considered polar,
charged, and this makes the polypeptide resistant against proteolytic degradation by a
subclass of proteases that recognise non-polar C-terminal tails of secreted proteins.



Another disclosure reported that proline residues at the C-terminus of nascent
polypeptide chains induce degradation of the polypeptide (2002, Proline residues at the C
terminus of nascent chains induce SsrA tagging during translation termination. J. Biol. Chem.
277:33825-33832).


SUMMARY OF THE INVENTION 

It is a well-known problem in the art of expressing polypeptides having proteolytic
activity that many such polypeptides are inherently unstable; they may be subject to
autoproteolysis, or they may be targeted for degradation by other proteases already during
their production, resulting in sub-optimal yields. Many other factors may contribute to their
instability, not all of which are understood at present. It is of great interest to provide
proteolytic polypeptides with an increased stability, so that they may be produced in higher
yields.

The present inventors provide herein proteolytic polypeptides of the S2A and/or S1E
protease classification that comprise at least three non-polar or uncharged polar amino acids
within the last four amino acids of the C-terminus of the polypeptide. The configuration of
the at least three non-polar or uncharged amino acid residues may be achieved by adding one
or more amino acid(s) as a fusion-tail to the polypeptide, for instance by modifying the
encoding polynucleotide to also encode the additional amino acid(s). Another way could be to
modify one or more existing C-terminal amino acid(s) in the polypeptide. These particular
amino acid configurations at the C-terminus of the polypeptide of the invention resulted in
much improved yields as compared to the yields of polypeptides that did not have these
C-terminal amino acid configurations, under otherwise identical conditions of production.
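
The C-terminal requirement stated above can be illustrated with a minimal sketch. It assumes one-letter amino acid codes and treats D, E, K, R and H as the charged polar residues, following the classification cited in the background section; every other standard residue is counted as non-polar or uncharged polar. The function names and the example sequences are hypothetical and serve only as an illustration.

```python
# Charged polar residues as listed in the background section:
# D and E (negatively charged); K, R and H (positively charged).
CHARGED_POLAR = frozenset("DEKRH")
STANDARD_RESIDUES = frozenset("ACDEFGHIKLMNPQRSTVWY")


def count_uncharged_in_tail(sequence, window=4):
    """Count non-polar or uncharged polar residues in the last `window` residues."""
    tail = sequence.upper()[-window:]
    if len(tail) < window:
        raise ValueError("sequence is shorter than the C-terminal window")
    unknown = set(tail) - STANDARD_RESIDUES
    if unknown:
        raise ValueError(f"non-standard residue(s) in tail: {sorted(unknown)}")
    return sum(1 for residue in tail if residue not in CHARGED_POLAR)


def has_required_tail(sequence, minimum=3, window=4):
    """True if at least `minimum` of the last `window` residues are uncharged."""
    return count_uncharged_in_tail(sequence, window) >= minimum


if __name__ == "__main__":
    # Hypothetical sequences, not taken from the sequence listing.
    for seq in ("MKTAYQTS", "AAAAKDEA", "GGGGAKAT"):
        print(seq, count_uncharged_in_tail(seq), has_required_tail(seq))
```

A C-terminal extension can be evaluated in the same way by appending the candidate tail to the mature sequence before calling `has_required_tail`.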

Accordingly, in a first aspect the invention relates to a secreted polypeptide which
has protease activity, preferably alpha-lytic endopeptidase activity, which polypeptide
comprises at least three non-polar or uncharged polar amino acids within the last four amino
acids of the C-terminus of the polypeptide, and which polypeptide:

(a) comprises an amino acid sequence which is at least 70%, or preferably 75%, 80%,
    85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or
    99% identical to the amino acid sequence of the mature part of the polypeptide
    shown in SEQ ID NO: 28; SEQ ID NO: 33; SEQ ID NO: 37; SEQ ID NO: 41; SEQ
    ID NO: 43; or SEQ ID NO: 45;

(b) comprises an amino acid sequence which is at least 70%, or preferably 75%, 80%,
    85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or
    99% identical to the amino acid sequence of the mature part of the polypeptide
    encoded by the polynucleotide in SEQ ID NO: 1; SEQ ID NO: 2; SEQ ID NO: 25;
    SEQ ID NO: 31; SEQ ID NO: 32; SEQ ID NO: 36; SEQ ID NO: 40; or SEQ ID NO:
    44;

(c) is encoded by a nucleic acid sequence which hybridizes under very low, low,
    medium-low, medium, medium-high, high, or very high stringency conditions with:
    (i)   a polynucleotide encoding the mature part of a protease, said
          polynucleotide obtainable from genomic DNA from Nocardiopsis
          dassonvillei subsp. dassonvillei DSM 43235 by use of primers SEQ ID
          NO's: 26 and 27; from Nocardiopsis Alba DSM 15647 by use of primers
          SEQ ID NO's: 34 and 35; from Nocardiopsis prasina DSM 15648 by use
          of primers SEQ ID NO's: 38 and 39; or from Nocardiopsis prasina DSM
          15649 by use of primers SEQ ID NO's: 42 and 39;
    (ii)  the polynucleotide of SEQ ID NO: 1; of SEQ ID NO: 2; of SEQ ID NO:
          25; of SEQ ID NO: 31; of SEQ ID NO: 32; of SEQ ID NO: 36; of SEQ ID
          NO: 40; or of SEQ ID NO: 44;
    (iii) a subsequence of (i) or (ii) of at least 500 nucleotides, preferably 400,
          300, 200, or 100 nucleotides; or
    (iv)  a complementary strand of (i), (ii), or (iii);
(d) comprises a mature part which is a variant of the mature part of the polypeptide
    having the amino acid sequence of SEQ ID NO: 28; SEQ ID NO: 33; SEQ ID NO:
    37; SEQ ID NO: 41; SEQ ID NO: 43; or SEQ ID NO: 45, comprising a substitution,
    deletion, extension, and/or insertion of one or more amino acids;

(e) is an allelic variant of (a), (b), (c), or (d); or

(f) is a fragment of (a), (b), (c), (d), or (e).

Preferably the polypeptide belongs to the S2A or the S1E peptidase families.

In a second aspect, the invention relates to an isolated polynucleotide encoding a
polypeptide as defined in the first aspect.

Still, in a third aspect the invention relates to a recombinant expression vector or
polynucleotide construct comprising a polynucleotide as defined in the previous aspect.

Yet a fourth aspect relates to a recombinant host cell comprising a polynucleotide as
defined in the second aspect, or an expression vector or polynucleotide construct as defined
in the previous aspect.

In a fifth aspect, the invention also relates to a transgenic plant, or plant part,
comprising a polynucleotide as defined in the second aspect, or an expression vector or
polynucleotide construct as defined in the third aspect.

The sixth aspect of the invention relates to a transgenic, non-human animal, or
products, or elements thereof, comprising a polynucleotide as defined in the second aspect,
or an expression vector or polynucleotide construct as defined in the third aspect.

The seventh aspect of the invention relates to a method for producing a polypeptide as
defined in the first aspect, the method comprising: (a) cultivating a recombinant host cell as
defined in the fourth aspect, or a transgenic plant or animal as defined in the fifth or sixth
aspects, to produce a supernatant comprising the polypeptide, and optionally (b) recovering
the polypeptide.

Other aspects of the invention relate to: an animal feed additive comprising at least
one polypeptide as defined in the first aspect; and

(a) at least one fat-soluble vitamin, and/or

(b) at least one water-soluble vitamin, and/or

(c) at least one trace mineral;

an animal feed composition having a crude protein content of 50 to 800 g/kg and
comprising at least one polypeptide as defined in the first aspect, or at least one feed
additive of the previous aspect;

a composition comprising at least one polypeptide as defined in the first aspect,
together with at least one other enzyme selected from amongst phytase (EC 3.1.3.8 or
3.1.3.26); xylanase (EC 3.2.1.8); galactanase (EC 3.2.1.89); alpha-galactosidase (EC
3.2.1.22); protease (EC 3.4.-.-); phospholipase A1 (EC 3.1.1.32); phospholipase A2 (EC
3.1.1.4); lysophospholipase (EC 3.1.1.5); phospholipase C (EC 3.1.4.3); phospholipase D (EC
3.1.4.4); and/or beta-glucanase (EC 3.2.1.4 or EC 3.2.1.6);

a method for using at least one polypeptide as defined in the first aspect, for improving
the nutritional value of an animal feed, for increasing digestible and/or soluble protein in
animal diets, for increasing the degree of hydrolysis of proteins in animal diets, and/or for
the treatment of vegetable proteins, the method comprising including the polypeptide(s) in
animal feed, and/or in a composition for use in animal feed;

a method for using at least one polypeptide as defined in the first aspect, comprising
including the polypeptide(s) in a detergent formulation.


DETAILED DESCRIPTION OF THE INVENTION

Proteases are classified on the basis of their catalytic mechanism into the following
groups: Serine proteases (S), Cysteine proteases (C), Aspartic proteases (A),
Metalloproteases (M), and unknown, or as yet unclassified, proteases (U), see Handbook of
Proteolytic Enzymes, A.J.Barrett, N.D.Rawlings, J.F.Woessner (eds), Academic Press (1998),
in particular the general introduction part.

Serine proteases are ubiquitous, being found in viruses, bacteria and eukaryotes; they
include exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20
families (denoted S1 - S27) of serine proteases have been identified, these being grouped into
6 clans denoted SA, SB, SC, SE, SF, and SG, on the basis of structural similarity and other
functional evidence (Barrett et al. 1998. Handbook of proteolytic enzymes). Structures are
known for at least four of the clans (SA, SB, SC and SE); these appear to be totally
unrelated, suggesting at least four evolutionary origins of serine peptidases. Alpha-lytic
endopeptidases belong to the chymotrypsin (SA) clan, within which they have been assigned to
subfamily A of the S2 family (S2A).

Another classification system of proteolytic enzymes is based on sequence information,
and is therefore used more often in the art of molecular biology; it is described in Rawlings,
N.D. et al., 2002, MEROPS: The protease database. Nucleic Acids Res. 30:343-348. The
MEROPS database is freely available electronically at http://www.merops.ac.uk. According to
the MEROPS system, the proteolytic enzymes classified as S2A in 'The Handbook of
Proteolytic Enzymes' are in MEROPS classified as 'S1E' proteases (Rawlings ND, Barrett AJ.
(1993). Evolutionary families of peptidases. Biochem. J. 290:205-218).

In particular embodiments, the proteases of the invention and for use according to the
invention are selected from the group consisting of:

(a) proteases belonging to the EC 3.4.-.- enzyme group;

(b) Serine proteases belonging to the S group of the above Handbook;

(c1) Serine proteases of peptidase family S2A;

(c2) Serine proteases of peptidase family S1E as described in Biochem. J. 290:205-218
(1993) and in the MEROPS protease database, release 6.20, March 24, 2003
(www.merops.ac.uk). The database is described in Rawlings, N.D., O'Brien, E. A. & Barrett,
A. J. (2002) MEROPS: the protease database. Nucleic Acids Res. 30, 343-348.

For determining whether a given protease is a Serine protease, and a family S2A
protease, reference is made to the above Handbook and the principles indicated therein. Such
determination can be carried out for all types of proteases, be it naturally occurring or
wild-type proteases, or genetically engineered or synthetic proteases.

Protease activity can be measured using any assay in which a substrate is employed
that includes peptide bonds relevant for the specificity of the protease in question. Assay-pH
and assay-temperature are likewise to be adapted to the protease in question. Examples of
assay-pH-values are pH 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, or 12. Examples of assay-temperatures
are 30, 35, 37, 40, 45, 50, 55, 60, 65, 70, 80, 90, or 95°C.

Examples of protease substrates are casein, such as Azurine-Crosslinked Casein
(AZCL-casein). Two protease assays are described in Example 2 herein, either of which can
be used to determine protease activity. For the purposes of this invention, the so-called pNA
Assay is a preferred assay.

There are no limitations on the origin of the protease of the invention and/or for use
according to the invention. Thus, the term protease includes not only natural or wild-type
proteases obtained from microorganisms of any genus, but also any mutants, variants,
fragments etc. thereof exhibiting protease activity, as well as synthetic proteases, such as
shuffled proteases, and consensus proteases. Such genetically engineered proteases can be
prepared as is generally known in the art, e.g. by Site-directed Mutagenesis, by PCR (using a
PCR fragment containing the desired mutation as one of the primers in the PCR reactions), or
by Random Mutagenesis. The preparation of consensus proteins is described in e.g. EP
897986. The term "obtained from" as used herein in connection with a given source shall mean
that the polypeptide encoded by the nucleic acid sequence is produced by the source or by a
cell in which the nucleic acid sequence from the source is present. In a preferred embodiment,
the polypeptide is secreted extracellularly.

In a specific embodiment, the protease is a low-allergenic variant designed to invoke a
reduced immunological response when exposed to animals, including man. The term
immunological response is to be understood as any reaction by the immune system of an
animal exposed to the protease. One type of immunological response is an allergic response
leading to increased levels of IgE in the exposed animal. Low-allergenic variants may be
prepared using techniques known in the art. For example the protease may be conjugated with
polymer moieties shielding portions or epitopes of the protease involved in an immunological
response. Conjugation with polymers may involve in vitro chemical coupling of polymer to the
protease, e.g. as described in WO 98/17929, WO 98/30682, WO 98/35028, and/or WO
99/00489. Conjugation may in addition or alternatively thereto involve in vivo coupling of
polymers to the protease. Such conjugation may be achieved by genetic engineering of the
nucleotide sequence encoding the protease, inserting consensus sequences encoding
additional glycosylation sites in the protease and expressing the protease in a host capable of
glycosylating the protease, see e.g. WO 00/28354. Another way of providing low-allergenic
variants is genetic engineering of the nucleotide sequence encoding the protease so as to
cause the protease to self-oligomerize, with the effect that protease monomers may shield the
epitopes of other protease monomers, thereby lowering the antigenicity of the oligomers.
Such products and their preparation are described e.g. in WO 98/16177. Epitopes involved in
an immunological response may be identified by various methods such as the phage display
method described in WO 00/26230 and WO 01/83559, or the random approach described in
EP 561907. Once an epitope has been identified, its amino acid sequence may be altered to
produce altered immunological properties of the protease by known gene manipulation
techniques such as site directed mutagenesis (see e.g. WO 00/26230, WO 00/28354 and/or
WO 00/22103) and/or conjugation of a polymer may be done in sufficient proximity to the
epitope for the polymer to shield the epitope.

The first aspect of the invention relates to a secreted polypeptide which has protease
activity, preferably alpha-lytic endopeptidase activity, which polypeptide comprises at least
three non-polar or uncharged polar amino acids within the last four amino acids of the
C-terminus of the polypeptide; and which polypeptide:

(a) comprises an amino acid sequence which is at least 70%, or preferably 75%, 80%,
    85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or
    99% identical to the amino acid sequence of the mature part of the polypeptide
    shown in SEQ ID NO: 28; SEQ ID NO: 33; SEQ ID NO: 37; SEQ ID NO: 41; SEQ
    ID NO: 43; or SEQ ID NO: 45;

(b) comprises an amino acid sequence which is at least 70%, or preferably 75%, 80%,
    85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or
    99% identical to the amino acid sequence of the mature part of the polypeptide
    encoded by the polynucleotide in SEQ ID NO: 1; SEQ ID NO: 2; SEQ ID NO: 25;
    SEQ ID NO: 31; SEQ ID NO: 32; SEQ ID NO: 36; SEQ ID NO: 40; or SEQ ID NO:
    44;

(c) is encoded by a nucleic acid sequence which hybridizes under very low, low,
    medium-low, medium, medium-high, high, or very high stringency conditions with:
    (i)   a polynucleotide encoding the mature part of a protease, said
          polynucleotide obtainable from genomic DNA from Nocardiopsis
          dassonvillei subsp. dassonvillei DSM 43235 by use of primers SEQ ID
          NO's: 26 and 27; from Nocardiopsis Alba DSM 15647 by use of primers
          SEQ ID NO's: 34 and 35; from Nocardiopsis prasina DSM 15648 by use
          of primers SEQ ID NO's: 38 and 39; or from Nocardiopsis prasina DSM
          15649 by use of primers SEQ ID NO's: 42 and 39;
    (ii)  the polynucleotide of SEQ ID NO: 1; of SEQ ID NO: 2; of SEQ ID NO:
          25; of SEQ ID NO: 31; of SEQ ID NO: 32; of SEQ ID NO: 36; of SEQ ID
          NO: 40; or of SEQ ID NO: 44;
    (iii) a subsequence of (i) or (ii) of at least 500 nucleotides, preferably 400,
          300, 200, or 100 nucleotides; or
    (iv)  a complementary strand of (i), (ii), or (iii);
(d) comprises a mature part which is a variant of the mature part of the polypeptide
    having the amino acid sequence of SEQ ID NO: 28; SEQ ID NO: 33; SEQ ID NO:
    37; SEQ ID NO: 41; SEQ ID NO: 43; or SEQ ID NO: 45, comprising a substitution,
    deletion, extension, and/or insertion of one or more amino acids;
(e) is an allelic variant of (a), (b), (c), or (d); or
(f) is a fragment of (a), (b), (c), (d), or (e).

For the purposes of the present invention, the degree of identity between two amino
acid sequences, as well as the degree of identity between two nucleotide sequences, is
determined by the program "align", which is a Needleman-Wunsch alignment (i.e. a global
alignment). The program is used for alignment of polypeptide, as well as nucleotide
sequences. The default scoring matrix BLOSUM50 is used for polypeptide alignments, and the
default identity matrix is used for nucleotide alignments. The penalty for the first residue of a
gap is -12 for polypeptides and -16 for nucleotides. The penalties for further residues of a gap
are -2 for polypeptides, and -4 for nucleotides.

"Align" is part of the FASTA package version v20u6 (see W. R. Pearson and D. J.
Lipman (1988), "Improved Tools for Biological Sequence Analysis", PNAS 85:2444-2448, and
W. R. Pearson (1990) "Rapid and Sensitive Sequence Comparison with FASTP and FASTA,"
Methods in Enzymology 183:63-98). FASTA protein alignments use the Smith-Waterman
algorithm with no limitation on gap size (see "Smith-Waterman algorithm", T. F. Smith and M.
S. Waterman (1981) J. Mol. Biol. 147:195-197).
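
The global alignment with separate penalties for the first and for further gap residues can be sketched as follows. This is not the "align" program itself but a minimal Needleman-Wunsch implementation with affine gaps, shown for nucleotide sequences with a simple identity scoring scheme. The match and mismatch scores (+5 and -4), the definition of percent identity as identical positions divided by alignment length, and all function names are assumptions made for illustration; the gap penalties are taken as -16 for the first gap residue and -4 for each further residue.

```python
NEG_INF = float("-inf")


def global_align(a, b, match=5, mismatch=-4, gap_first=-16, gap_further=-4):
    """Global alignment with affine gaps; returns (score, aligned_a, aligned_b).

    State 0: a[i-1] is paired with b[j-1].
    State 1: a[i-1] is aligned against a gap.
    State 2: b[j-1] is aligned against a gap.
    """
    n, m = len(a), len(b)
    score = [[[NEG_INF] * (m + 1) for _ in range(n + 1)] for _ in range(3)]
    back = [[[0] * (m + 1) for _ in range(n + 1)] for _ in range(3)]
    score[0][0][0] = 0
    for i in range(1, n + 1):  # leading gap in b
        score[1][i][0] = gap_first + (i - 1) * gap_further
        back[1][i][0] = 1 if i > 1 else 0
    for j in range(1, m + 1):  # leading gap in a
        score[2][0][j] = gap_first + (j - 1) * gap_further
        back[2][0][j] = 2 if j > 1 else 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            diag = [score[k][i - 1][j - 1] for k in range(3)]
            up = [score[0][i - 1][j] + gap_first,
                  score[1][i - 1][j] + gap_further,
                  score[2][i - 1][j] + gap_first]
            left = [score[0][i][j - 1] + gap_first,
                    score[1][i][j - 1] + gap_first,
                    score[2][i][j - 1] + gap_further]
            for state, cands, extra in ((0, diag, pair), (1, up, 0), (2, left, 0)):
                best = max(range(3), key=cands.__getitem__)
                score[state][i][j] = cands[best] + extra
                back[state][i][j] = best
    # Trace back from the best-scoring end state.
    state = max(range(3), key=lambda k: score[k][n][m])
    total = score[state][n][m]
    i, j, top, bottom = n, m, [], []
    while i > 0 or j > 0:
        previous = back[state][i][j]
        if state == 0:
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif state == 1:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
        state = previous
    return total, "".join(reversed(top)), "".join(reversed(bottom))


def percent_identity(a, b, **scoring):
    """Identical aligned positions as a percentage of the alignment length."""
    _, top, bottom = global_align(a, b, **scoring)
    if not top:
        return 0.0
    identical = sum(1 for x, y in zip(top, bottom) if x == y)
    return 100.0 * identical / len(top)
```

For polypeptides, the same recurrence applies with a substitution matrix such as BLOSUM50 in place of the match and mismatch scores, and with penalties of -12 and -2.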

The degree of identity between two amino acid sequences may also be determined by
the Clustal method (Higgins, 1989, CABIOS 5: 151-153) using the LASERGENE(TM)
MEGALIGN(TM) software (DNASTAR, Inc., Madison, WI) with an identity table and the following
multiple alignment parameters: Gap penalty of 10, and gap length penalty of 10. Pairwise
alignment parameters are Ktuple=1, gap penalty=3, windows=5, and diagonals=5. The degree
of identity between two nucleotide sequences may be determined using the same algorithm
and software package as described above with the following settings: Gap penalty of 10, and
gap length penalty of 10. Pairwise alignment parameters are Ktuple=3, gap penalty=3 and
windows=20.

A fragment of one of the encoding polynucleotide sequences of the invention is a
polynucleotide which encodes a polypeptide having one or more amino acids deleted from the
amino and/or carboxyl terminus compared to the full-length amino acid sequence. In one
embodiment a fragment encodes at least 75 amino acid residues, or at least 100 amino acid
residues, or at least 125 amino acid residues, or at least 150 amino acid residues, or at least
160 amino acid residues, or at least 165 amino acid residues, or at least 170 amino acid
residues, or at least 175 amino acid residues.

An allelic variant denotes any of two or more alternative forms of a gene occupying the
same chromosomal locus. Allelic variation arises naturally through mutation, and may result in
polymorphism within populations. Gene mutations can be silent (no change in the encoded
polypeptide) or may encode polypeptides having altered amino acid sequences. An allelic
variant of a polypeptide is a polypeptide encoded by an allelic variant of a gene.

The present invention also relates to isolated polypeptides having protease activity and
which are encoded by nucleic acid sequences which hybridize under very low, or low, or low-
medium, medium, medium-high, high, or very high stringency conditions with a nucleic acid
probe which hybridizes under the same conditions with (i) a polynucleotide encoding a
protease obtainable from genomic DNA from Nocardiopsis dassonvillei subsp. dassonvillei
DSM 43235 by use of primers SEQ ID NO's: 26 and 27; from Nocardiopsis Alba DSM 15647
by use of primers SEQ ID NO's: 34 and 35; from Nocardiopsis prasina DSM 15648 by use of
primers SEQ ID NO's: 38 and 39; or from Nocardiopsis prasina DSM 15649 by use of primers
SEQ ID NO's: 42 and 39; (ii) the polynucleotide of SEQ ID NO: 1; of SEQ ID NO: 2; of SEQ ID
NO: 25; of SEQ ID NO: 31; of SEQ ID NO: 32; of SEQ ID NO: 36; of SEQ ID NO: 40; or of
SEQ ID NO: 44; (iii) a subsequence of (i) or (ii) of at least 500 nucleotides, preferably 400,
300, 200, or 100 nucleotides; or (iv) a complementary strand of (i), (ii), or (iii) (J. Sambrook,
E.F. Fritsch, and T. Maniatis, 1989, Molecular Cloning, A Laboratory Manual, 2nd edition, Cold
Spring Harbor, New York). In one particular embodiment the nucleic acid probe is selected
from amongst the nucleic acid sequences of (a), (b), or (c) above. A polynucleotide
corresponding to the mature peptide encoding part of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID
NO: 25, SEQ ID NO: 31, SEQ ID NO: 32, SEQ ID NO: 36, SEQ ID NO: 40, or SEQ ID NO: 44
is a preferred probe.

The nucleic acid sequences of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 25, SEQ ID
NO: 31, SEQ ID NO: 32, SEQ ID NO: 36, SEQ ID NO: 40, or SEQ ID NO: 44, or a
subsequence thereof, as well as the amino acid sequences of SEQ ID NO: 28, SEQ ID NO:
33, SEQ ID NO: 37, SEQ ID NO: 41, SEQ ID NO: 43, or SEQ ID NO: 45, or a fragment
thereof, and even a genomic polynucleotide encoding a protease obtainable from genomic
DNA from Nocardiopsis dassonvillei subsp. dassonvillei DSM 43235 by use of primers SEQ ID
NO's: 26 and 27; from Nocardiopsis Alba DSM 15647 by use of primers SEQ ID NO's: 34 and
35; from Nocardiopsis prasina DSM 15648 by use of primers SEQ ID NO's: 38 and 39; or from
Nocardiopsis prasina DSM 15649 by use of primers SEQ ID NO's: 42 and 39, or a
subsequence thereof, may be used to design a nucleic acid probe to identify and clone DNA
encoding polypeptides having protease activity from strains of different genera or species
according to methods well known in the art. In particular, such probes can be used for
hybridization with the genomic DNA or cDNA of the genus or species of interest, following
standard Southern blotting procedures, in order to identify and isolate the corresponding gene
therein. Such probes can be considerably shorter than the entire sequence, but should be at
least 15, preferably at least 25, and more preferably at least 35 nucleotides in length. Longer
probes can also be used. Both DNA and RNA probes can be used. The probes are typically
labeled for detecting the corresponding gene (for example, with 32P, 3H, biotin, or avidin).
Such probes are encompassed by the present invention.
Thus, a genomic DNA or cDNA library prepared from such other organisms may be
screened for DNA that hybridizes with the probes described above and which encodes a
polypeptide having protease activity. Genomic or other DNA from such other organisms may
be separated by agarose or polyacrylamide gel electrophoresis, or other separation
techniques. DNA from the libraries or the separated DNA may be transferred to and
immobilized on nitrocellulose or other suitable carrier material. In order to identify a clone or
DNA which is homologous with SEQ ID NO: 1 or a subsequence thereof, the carrier material is
used in a Southern blot. For purposes of the present invention, hybridization indicates that the
nucleic acid sequence hybridizes to a labeled nucleic acid probe corresponding to the nucleic
acid sequence shown in SEQ ID NO: 1, its complementary strand, or a subsequence thereof,
under very low to very high stringency conditions. Molecules to which the nucleic acid probe
hybridizes under these conditions are detected using X-ray film.

For long probes of at least 100 nucleotides in length, very low to very high stringency
conditions are defined as prehybridization and hybridization at 42°C in 5X SSPE, 0.3% SDS,
200 µg/ml sheared and denatured salmon sperm DNA, and either 25% formamide for very low
and low stringencies, 35% formamide for medium and medium-high stringencies, or 50%
formamide for high and very high stringencies, following standard Southern blotting
procedures.

For long probes of at least 100 nucleotides in length, the carrier material is finally
washed three times each for 15 minutes using 0.2 x SSC, 0.2% SDS, 20% formamide
preferably at least at 45°C (very low stringency), more preferably at least at 50°C (low
stringency), more preferably at least at 55°C (medium stringency), more preferably at least at
60°C (medium-high stringency), even more preferably at least at 65°C (high stringency), and
most preferably at least at 70°C (very high stringency).
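
The long-probe conditions above amount to a small lookup table, sketched below. The formamide percentages (25, 35, 50) and minimum wash temperatures (45 to 70°C) are the values as read from the two preceding paragraphs; the table and function names are hypothetical and only illustrate how the stringency levels map onto conditions.

```python
# Stringency levels in ascending order, as named in the text.
STRINGENCY_LEVELS = ["very low", "low", "medium", "medium-high", "high", "very high"]

# Formamide percentage during prehybridization and hybridization at 42 deg C.
FORMAMIDE_PERCENT = {
    "very low": 25, "low": 25,
    "medium": 35, "medium-high": 35,
    "high": 50, "very high": 50,
}

# Minimum temperature (deg C) of the final washes for each level.
MIN_WASH_TEMP_C = {
    "very low": 45, "low": 50, "medium": 55,
    "medium-high": 60, "high": 65, "very high": 70,
}


def long_probe_conditions(level):
    """Return the formamide percentage and minimum wash temperature for a level."""
    if level not in MIN_WASH_TEMP_C:
        raise ValueError(f"unknown stringency level: {level!r}")
    return {
        "formamide_percent": FORMAMIDE_PERCENT[level],
        "wash_temp_c": MIN_WASH_TEMP_C[level],
    }


def highest_stringency_passed(wash_temp_c):
    """Return the highest level whose minimum wash temperature was reached, or None."""
    passed = [lvl for lvl in STRINGENCY_LEVELS if MIN_WASH_TEMP_C[lvl] <= wash_temp_c]
    return passed[-1] if passed else None
```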

For short probes about 15 nucleotides to about 70 nucleotides in length, stringency
conditions are defined as prehybridization, hybridization, and washing post-hybridization at
5°C to 10°C below the calculated Tm using the calculation according to Bolton and McCarthy
(1962, Proceedings of the National Academy of Sciences USA 48:1390) in 0.9 M NaCl, 0.09 M
Tris-HCl pH 7.6, 6 mM EDTA, 0.5% NP-40, 1X Denhardt's solution, 1 mM sodium
pyrophosphate, 1 mM sodium monobasic phosphate, 0.1 mM ATP, and 0.2 mg of yeast RNA per
ml following standard Southern blotting procedures.






WO 2#4/.U .1.2*9 



FCr/»K2lN>4/tf*KH3I 



For short probes about 15 nucleotides to about 70 nucleotides in length, the carrier
material is washed once in 6X SSC plus 0.1% SDS for 15 minutes and twice each for 15
minutes using 6X SSC at 5°C to 10°C below the calculated Tm.

The present invention also relates to variants of the polypeptide of the invention,
comprising a substitution, deletion, and/or insertion of one or more amino acids.

In a particular embodiment, amino acid changes are of a minor nature, that is
conservative amino acid substitutions that do not significantly affect the folding and/or
activity of the protein; small deletions, typically of one to about 30 amino acids; small
amino- or carboxyl-terminal extensions, such as an amino-terminal methionine residue; a small
peptide of up to about 20-25 residues; or a small extension that facilitates purification by
changing net charge or another function, such as a poly-histidine tract, an antigenic epitope
or a binding domain.

Examples of conservative substitutions are within the group of basic amino acids
(arginine, lysine and histidine), acidic amino acids (glutamic acid and aspartic acid), polar
amino acids (glutamine and asparagine), hydrophobic amino acids (leucine, isoleucine and
valine), aromatic amino acids (phenylalanine, tryptophan and tyrosine), and small amino acids
(glycine, alanine, serine, threonine and methionine). Amino acid substitutions which do not
generally alter the specific activity are known in the art and are described, for example, by H.
Neurath and R.L. Hill, 1979, In, The Proteins, Academic Press, New York. The most commonly
occurring exchanges are Ala/Ser, Val/Ile, Asp/Glu, Thr/Ser, Ala/Gly, Ala/Thr, Ser/Asn, Ala/Val,
Ser/Gly, Tyr/Phe, Ala/Pro, Lys/Arg, Asp/Asn, Leu/Ile, Leu/Val, Ala/Glu, and Asp/Gly as well as
these in reverse.
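
The grouping of conservative substitutions listed above can be expressed as a simple lookup, sketched here with one-letter amino acid codes. The dictionary and function names are hypothetical; the group memberships are those given in the preceding paragraph, and the sketch does not cover the separate list of commonly occurring exchanges.

```python
# Groups of amino acids within which a substitution is regarded as conservative.
CONSERVATIVE_GROUPS = {
    "basic": frozenset("RKH"),        # arginine, lysine, histidine
    "acidic": frozenset("ED"),        # glutamic acid, aspartic acid
    "polar": frozenset("QN"),         # glutamine, asparagine
    "hydrophobic": frozenset("LIV"),  # leucine, isoleucine, valine
    "aromatic": frozenset("FWY"),     # phenylalanine, tryptophan, tyrosine
    "small": frozenset("GASTM"),      # glycine, alanine, serine, threonine, methionine
}


def conservative_group(original, replacement):
    """Return the name of the group containing both residues, or None."""
    original, replacement = original.upper(), replacement.upper()
    for name, members in CONSERVATIVE_GROUPS.items():
        if original in members and replacement in members:
            return name
    return None


def is_conservative(original, replacement):
    """True if the two different residues belong to the same group."""
    return original.upper() != replacement.upper() and \
        conservative_group(original, replacement) is not None
```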

In a particular embodiment, the polypeptides of the invention and for use according to
the invention are acid-stable. For the present purposes, the term acid-stable means that the
residual activity after 2 hours of incubation at pH 3.0 and 37°C is at least 50%, as compared
to the residual activity of a corresponding sample incubated for 2 hours at pH 9.0 and 5°C. In
a particular embodiment, the residual activity is at least 60%, 70%, 80% or at least 90%.
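
The acid-stability criterion above is a ratio of two activity measurements, which the following minimal sketch makes explicit. It assumes that both samples were assayed in the same way and that activities are given in the same arbitrary units; the function names and the example readings are hypothetical.

```python
def residual_activity_percent(activity_ph3, activity_ph9_reference):
    """Activity after 2 h at pH 3.0 and 37 deg C, relative to the pH 9.0, 5 deg C sample."""
    if activity_ph9_reference <= 0:
        raise ValueError("reference activity must be positive")
    return 100.0 * activity_ph3 / activity_ph9_reference


def is_acid_stable(activity_ph3, activity_ph9_reference, threshold_percent=50.0):
    """True if the residual activity reaches the threshold (50% by default)."""
    return residual_activity_percent(activity_ph3, activity_ph9_reference) >= threshold_percent


if __name__ == "__main__":
    # Hypothetical assay readings in arbitrary units.
    print(residual_activity_percent(62.0, 100.0))                # 62.0
    print(is_acid_stable(62.0, 100.0))                           # True at the 50% threshold
    print(is_acid_stable(62.0, 100.0, threshold_percent=70.0))   # False at 70%
```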

In particular embodiments, the polypeptide of the invention is i) a bacterial protease; ii)
a protease of the phylum Actinobacteria; iii) of the class Actinobacteria; iv) of the order
Actinomycetales; v) of the family Nocardiopsaceae; vi) of the genus Nocardiopsis; and/or a
protease derived from vii) Nocardiopsis species such as Nocardiopsis alba, Nocardiopsis
antarctica, Nocardiopsis prasina, composta, exhalans, halophila, halotolerans, kunsanensis,
listeri, lucentensis, metallicus, synnemataformans, trehalosi, tropica, umidischolae,
xinjiangensis, or Nocardiopsis dassonvillei, for example Nocardiopsis dassonvillei DSM 43235.

The above taxonomy is according to the chapter: The road map to the Manual by G.M.
Garrity & J. G. Holt in Bergey's Manual of Systematic Bacteriology, 2001, second edition,
volume 1, David R. Boone, Richard W. Castenholz.

It will be understood that for the aforementioned species, the invention encompasses
both the perfect and imperfect states, and other taxonomic equivalents, e.g., anamorphs,
regardless of the species name by which they are known. Those skilled in the art will readily
recognise the identity of appropriate equivalents.

Strains of these species are readily accessible to the public in a number of culture
collections, such as the American Type Culture Collection (ATCC), Deutsche Sammlung von
Mikroorganismen und Zellkulturen GmbH (DSMZ), Centraalbureau voor Schimmelcultures
(CBS), and Agricultural Research Service Patent Culture Collection, Northern Regional
Research Center (NRRL). E.g., Nocardiopsis dassonvillei subsp. dassonvillei DSM 43235 is
publicly available from DSMZ (Deutsche Sammlung von Mikroorganismen und Zellkulturen
GmbH, Braunschweig, Germany). The strain was also deposited at other depositary
institutions as follows: ATCC 23218, IMRU 1250, NCTC 10489.

Furthermore, such polypeptides may be identified and obtained from other sources
including microorganisms isolated from nature (e.g., soil, composts, water, etc.) using the
above-mentioned probes. Techniques for isolating microorganisms from natural habitats are
well known in the art. The nucleic acid sequence may then be derived by similarly screening a
genomic or cDNA library of another microorganism. Once a nucleic acid sequence encoding a
polypeptide has been detected with the probe(s), the sequence may be isolated or cloned by
utilizing techniques which are known to those of ordinary skill in the art (see, e.g., Sambrook et
al., 1989, supra).

As defined herein, an "isolated" polypeptide is a polypeptide which is essentially free of
other polypeptides, e.g., at least about 20% pure, preferably at least about 40% pure, more
preferably about 60% pure, even more preferably about 80% pure, most preferably about 90%
pure, and even most preferably about 95% pure, as determined by SDS-PAGE.

Polypeptides encoded by nucleic acid sequences of the present invention also include
fused polypeptides or cleavable fusion polypeptides in which another polypeptide is fused at
the N-terminus or the C-terminus of the polypeptide or fragment thereof. A fused polypeptide is
produced by fusing a nucleic acid sequence (or a portion thereof) encoding another
polypeptide to a nucleic acid sequence (or a portion thereof) of the present invention.
Techniques for producing fusion polypeptides are known in the art, e.g. PCR, or ligating the
coding sequences encoding the polypeptides so that they are in frame and that expression of
the fused polypeptide is under control of the same promoter(s) and terminator.

In the present context, non-polar amino acids are G, A, V, L, I, M, P, F or W; and
uncharged polar amino acids are S, T, N, Q, Y, or C. The terms "non-polar" and "uncharged
polar" when used to describe amino acids in a polypeptide are generally recognized in the art
as characterising the side chain of the amino acid. For instance, the free carboxylic acid
of the C-terminal amino acid in a polypeptide is not considered when determining whether this
amino acid is a non-polar or uncharged polar amino acid.
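
The classification above can be applied residue by residue to a candidate C-terminal extension. The sketch below is illustrative only: it assumes the one-letter sets G, A, V, L, I, M, P, F, W (non-polar) and S, T, N, Q, Y, C (uncharged polar), and the function names and example extensions are hypothetical.

```python
NON_POLAR = frozenset("GAVLIMPFW")
UNCHARGED_POLAR = frozenset("STNQYC")


def classify_residue(residue: str) -> str:
    """Classify one amino acid (one-letter code) by its side chain only."""
    residue = residue.upper()
    if residue in NON_POLAR:
        return "non-polar"
    if residue in UNCHARGED_POLAR:
        return "uncharged polar"
    return "other"


def extension_is_nonpolar_or_uncharged(extension: str) -> bool:
    """True if every residue of the extension is non-polar or uncharged polar."""
    return len(extension) > 0 and all(
        classify_residue(r) != "other" for r in extension
    )


print(classify_residue("V"))
print(extension_is_nonpolar_or_uncharged("TP"))
print(extension_is_nonpolar_or_uncharged("KP"))
```

Because only the side chain is considered, the free carboxyl group of the last residue plays no part in the check.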

A preferred embodiment relates to a polypeptide of the first aspect, the mature part of
which is a wildtype polypeptide; an artificial variant of a wildtype polypeptide, said variant
having one or more amino acid(s) added to the C-terminus as compared to the wildtype, and
preferably the one or more added amino acid(s) is (are) non-polar or uncharged, and even more
preferably the one or more added amino acid(s) is one or more of Q, S, V, A, or P; a shuffled
polypeptide; or a protein-engineered polypeptide.

Another preferred embodiment relates to a polypeptide of the first aspect, wherein the
one or more added amino acids are selected from the group consisting of: QSHVQSAP,
QSAP, QP, TL, TT, QL, TP, LP, TI, IQ, QP, PI, LT, TQ, IT, QQ, and PQ.

The inventors determined that the polypeptides of the present invention were produced
in even greater yields when they were expressed as mature proteases fused to a heterologous
pro-region, as shown in the examples below.

Accordingly, a preferred embodiment relates to the polypeptide according to the
first aspect which when expressed and before maturation comprises a heterologous pro-region
from a protease; preferably the pro-region is derived from an S2A or S1E protease, more
preferably the pro-region is encoded by a polynucleotide which hybridizes under very low, low,
medium-low, medium, medium-high, high, or very high stringency conditions with a
polynucleotide encoding the pro-region shown in position -168 to -1 of SEQ ID NO: 28, in
position 1-188 of SEQ ID NO: 30, in position -187 to -1 of SEQ ID NO: 33, in position -185 to -1
of SEQ ID NO: 37, in position -165 to -1 of SEQ ID NO: 41, in position -165 to -1 of SEQ ID
NO: 43, in position -185 to -1 of SEQ ID NO: 46, in position 1-185 of SEQ ID NO: 48, in
position 1-166 of SEQ ID NO: 47, in position 1-168 of SEQ ID NO: 48, in position 1-168 of
SEQ ID NO: 49, in position 1-188 of SEQ ID NO: 50, in position 1-186 of SEQ ID NO: 51, in
position 1-183 of SEQ ID NO: 52, or in position 1-186 of SEQ ID NO: 53; and most preferably it
is at least 70% identical, or preferably 75%, 80%, 85%, 86%, 87%, 88%, 89%, 90%, 91%,
92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical to the pro-region shown in position
-188 to -1 of SEQ ID NO: 28, in position 1-188 of SEQ ID NO: 30, in position -187 to -1 of SEQ
ID NO: 33, in position -188 to -1 of SEQ ID NO: 37, in position -186 to -1 of SEQ ID NO: 41, in
position -165 to -1 of SEQ ID NO: 43, in position -165 to -1 of SEQ ID NO: 45, in position 1-165
of SEQ ID NO: 46, in position 1-188 of SEQ ID NO: 47, in position 1-188 of SEQ ID NO: 48, in
position 1-166 of SEQ ID NO: 49, in position 1-166 of SEQ ID NO: 60, in position 1-166 of
SEQ ID NO: 51, in position 1-168 of SEQ ID NO: 52, or in position 1-166 of SEQ ID NO: 53.

When the particular C-terminal amino acid configuration of the polypeptide of the
invention was combined with a heterologous secretion signal peptide fused to the N-terminal
part of the polypeptide of the invention, a synergy was achieved and a greater yield resulted.

Accordingly, a preferred embodiment of the invention relates to the polypeptide of the
first aspect which when expressed comprises a heterologous secretion signal peptide which is
cleaved from the polypeptide when the polypeptide is secreted; preferably the heterologous
secretion signal peptide is derived from a heterologous protease; preferably the heterologous
secretion signal peptide comprises an amino acid sequence having a sequence identity of at
least 70%, or preferably 75%, 80%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%,
95%, 96%, 97%, 98%, or 99%, with the amino acid sequence encoded by polynucleotides 1 ~
m of SEQ ID NO: 2, or SEQ ID NO: 44.

Nucleic Acid Sequences

The present invention also relates to isolated nucleic acid sequences that encode a
polypeptide of the present invention. Particular nucleic acid sequences of the invention are the
polynucleotides of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 25, SEQ ID NO: 31, SEQ ID
NO: 32, SEQ ID NO: 36, SEQ ID NO: 40, or SEQ ID NO: 44. Another particular nucleic acid
sequence of the invention is the sequence, preferably the mature polypeptide encoding region
thereof, which is obtainable from genomic DNA from Nocardiopsis dassonvillei subspecies
dassonvillei DSM 43235. The present invention also encompasses nucleic acid sequences
which encode a polypeptide having the amino acid sequence of amino acids shown in
positions 1 to 188, or positions -188 to 188, of SEQ ID NO: 43, which nucleic acid sequences
differ from the corresponding parts of SEQ ID NO: 1 by virtue of the degeneracy of the genetic
code. The present invention also relates to subsequences of the above polynucleotides
which encode polypeptide fragments that have protease activity.
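
The degeneracy of the genetic code mentioned above means that different nucleotide sequences can encode the same amino acid sequence. The sketch below illustrates this with a deliberately tiny codon table covering only alanine, serine and proline; the two DNA strings are invented for illustration and are not taken from any SEQ ID NO.

```python
# Partial standard codon table: only the six codons used in this example.
CODON_TABLE = {
    "GCT": "A", "GCC": "A",   # alanine
    "TCT": "S", "AGC": "S",   # serine
    "CCA": "P", "CCG": "P",   # proline
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA string using the partial table above."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


seq_a = "GCTTCTCCA"
seq_b = "GCCAGCCCG"
# Different nucleotide sequences, identical encoded peptide.
print(seq_a != seq_b, translate(seq_a) == translate(seq_b))
```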

A subsequence of a polynucleotide is a nucleic acid sequence from which one or more
nucleotides from the 5' and/or 3' end have been deleted. Preferably, a subsequence contains at
least 225 nucleotides, more preferably at least 300 nucleotides, even more preferably at least
375, 450, 500, 531, 600, 700, 800, 900 or 1000 nucleotides. The present invention also relates
to nucleotide sequences which have a degree of identity to the polynucleotide of SEQ ID NO:
1, SEQ ID NO: 2, SEQ ID NO: 25, SEQ ID NO: 31, SEQ ID NO: 32, SEQ ID NO: 36, SEQ ID
NO: 40, or SEQ ID NO: 44 of at least 85%, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, or
at least 99%.

The techniques used to isolate or clone a nucleic acid sequence encoding a
polypeptide are known in the art and include isolation from genomic DNA, preparation from
cDNA, or a combination thereof. The cloning of the nucleic acid sequences of the present
invention from such genomic DNA can be effected, e.g., by using the well known polymerase
chain reaction (PCR) or antibody screening of expression libraries to detect cloned DNA
fragments with shared structural features. See, e.g., Innis et al., 1990, PCR: A Guide to
Methods and Application, Academic Press, New York. Other nucleic acid amplification
procedures such as ligase chain reaction (LCR), ligated activated transcription (LAT) and
nucleic acid sequence-based amplification (NASBA) may be used. The nucleic acid sequence
may be cloned from a strain of Nocardiopsis or another or related organism and thus, for
example, may be an allelic or species variant of the polypeptide encoding region of the nucleic
acid sequence.

The term "isolated nucleic acid sequence" as used herein refers to a nucleic acid
sequence which is essentially free of other nucleic acid sequences, e.g., at least about 20%
pure, preferably at least about 40% pure, more preferably at least about 60% pure, even more
preferably at least about 80% pure, and most preferably at least about 90% pure as
determined by agarose electrophoresis. For example, an isolated nucleic acid sequence can
be obtained by standard cloning procedures used in genetic engineering to relocate the nucleic
acid sequence from its natural location to a different site where it will be reproduced. The
cloning procedures may involve excision and isolation of a desired nucleic acid fragment
comprising the nucleic acid sequence encoding the polypeptide, insertion of the fragment into
a vector molecule, and incorporation of the recombinant vector into a host cell where multiple
copies or clones of the nucleic acid sequence will be replicated. The nucleic acid sequence
may be of genomic, cDNA, RNA, semisynthetic, synthetic origin, or any combinations thereof.

Modification of a nucleic acid sequence encoding a polypeptide of the present invention
may be necessary for the synthesis of polypeptides substantially similar to the polypeptide.
The term "substantially similar" to the polypeptide refers to non-naturally occurring forms of the
polypeptide. These polypeptides may differ in some engineered way from the polypeptide
isolated from its native source, e.g., variants that differ in specific activity, thermostability, pH
optimum, allergenicity, or the like. The variant sequence may be constructed on the basis of
the nucleic acid sequence presented as the polypeptide encoding part of the polynucleotides
of the invention, e.g. a subsequence thereof, and/or by introduction of nucleotide substitutions
which do not give rise to another amino acid sequence of the polypeptide encoded by the
nucleic acid sequence, but which correspond to the codon usage of the host organism
intended for production of the protease, or by introduction of nucleotide substitutions which
may give rise to a different amino acid sequence. For a general description of nucleotide
substitution, see, e.g., Ford et al., 1991, Protein Expression and Purification 2: 95-107. Low-
allergenic polypeptides can e.g. be prepared as described above.

It will be apparent to those skilled in the art that such substitutions can be made outside
the regions critical to the function of the molecule and still result in an active polypeptide.
Amino acid residues essential to the activity of the polypeptide encoded by the isolated nucleic
acid sequence of the invention, and therefore preferably not subject to substitution, may be
identified according to procedures known in the art, such as site-directed mutagenesis or
alanine-scanning mutagenesis (see, e.g., Cunningham and Wells, 1989, Science 244: 1081-
1085). In the latter technique, mutations are introduced at every positively charged residue in
the molecule, and the resultant mutant molecules are tested for protease activity to identify
amino acid residues that are critical to the activity of the molecule. Sites of substrate-protease
interaction can also be determined by analysis of the three-dimensional structure as
determined by such techniques as nuclear magnetic resonance analysis, crystallography or
photoaffinity labelling (see, e.g., de Vos et al., 1992, Science 255: 306-312; Smith et al., 1992,
Journal of Molecular Biology 224: 899-904; Wlodaver et al., 1992, FEBS Letters 309: 59-64).
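
The alanine-scanning step described above can be sketched as an enumeration of single-site variants. This is an illustration only: treating K and R as the positively charged residues is an assumption (H could be added to `targets`), the function name is hypothetical, and the short peptide is invented.

```python
def alanine_scan(sequence: str, targets: frozenset = frozenset("KR")):
    """Return (mutation_name, variant_sequence) for each targeted residue,
    replacing one residue at a time with alanine (1-based positions)."""
    variants = []
    for index, residue in enumerate(sequence):
        if residue in targets:
            name = f"{residue}{index + 1}A"
            variant = sequence[:index] + "A" + sequence[index + 1:]
            variants.append((name, variant))
    return variants


# Each variant would then be expressed and assayed for protease activity.
for name, variant in alanine_scan("MKTR"):
    print(name, variant)
```

A variant whose activity is lost or strongly reduced points to a residue that is critical for activity and is therefore a poor candidate for substitution.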

The present invention also relates to isolated nucleic acid sequences encoding a
polypeptide of the present invention, which hybridize under very low stringency conditions,
preferably low stringency conditions, more preferably medium stringency conditions, more
preferably medium-high stringency conditions, even more preferably high stringency
conditions, and most preferably very high stringency conditions with a nucleic acid probe which
hybridizes under the same conditions with the nucleic acid sequence of the invention or its
complementary strand; or allelic variants and subsequences thereof (Sambrook et al., 1989,
supra), as defined herein.

The present invention also relates to isolated nucleic acid sequences produced by
(a) hybridizing a DNA under very low, low, medium, medium-high, high, or very high stringency
conditions with (i) a polynucleotide of the invention, (ii) a subsequence of (i), or (iii) a
complementary strand of (i), or (ii); and (b) isolating the nucleic acid sequence. The
subsequence is preferably a sequence of at least 100 nucleotides such as a sequence that
encodes a polypeptide fragment which has protease activity.

The introduction of a mutation into the nucleic acid sequence to exchange one
nucleotide for another nucleotide may be accomplished by site-directed mutagenesis using
any of the methods known in the art. Particularly useful is the procedure that utilizes a
supercoiled, double stranded DNA vector with an insert of interest and two synthetic primers
containing the desired mutation. The oligonucleotide primers, each complementary to opposite
strands of the vector, extend during temperature cycling by means of Pfu DNA polymerase. On
incorporation of the primers, a mutated plasmid containing staggered nicks is generated.
Following temperature cycling, the product is treated with DpnI, which is specific for methylated
and hemimethylated DNA, to digest the parental DNA template and to select for mutation-
containing synthesized DNA. Other procedures known in the art may also be used. The
invention also relates to an isolated polynucleotide encoding a polypeptide as defined in the
first aspect.
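
The two complementary mutagenic primers used in the procedure above can be derived from the template sequence. The sketch below is a simplified illustration: the function names, the flank length and the template are hypothetical, and real primer design would also consider melting temperature and primer length.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def mutagenic_primers(template: str, position: int, new_base: str, flank: int = 10):
    """Return (forward, reverse) primers carrying a single-nucleotide exchange.

    position is 0-based on the top strand; the reverse primer is the reverse
    complement of the forward primer, so each anneals to an opposite strand.
    """
    if not 0 <= position < len(template):
        raise ValueError("position outside template")
    if template[position] == new_base:
        raise ValueError("new base is identical to the template base")
    start = max(0, position - flank)
    end = min(len(template), position + flank + 1)
    forward = template[start:position] + new_base + template[position + 1:end]
    return forward, reverse_complement(forward)


forward, reverse = mutagenic_primers("AAACCCGGGTTT", position=5, new_base="T", flank=3)
print(forward, reverse)
```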

Nucleic Acid Constructs 

The present invention also relates to nucleic acid constructs comprising a nucleic acid
sequence of the present invention operably linked to one or more control sequences that direct
the expression of the coding sequence in a suitable host cell under conditions compatible with
the control sequences. Expression will be understood to include any step involved in the
production of the polypeptide including, but not limited to, transcription, post-transcriptional
modification, translation, post-translational modification, and secretion.

"Nucleic acid construct" is defined herein as a nucleic acid molecule, either single- or
double-stranded, which is isolated from a naturally occurring gene or which has been modified
to contain segments of nucleic acid combined and juxtaposed in a manner that would not
otherwise exist in nature. The term nucleic acid construct is synonymous with the term
expression cassette when the nucleic acid construct contains all the control sequences
required for expression of a coding sequence of the present invention. The term "coding
sequence" is defined herein as a nucleic acid sequence that directly specifies the amino acid
sequence of its protein product. The boundaries of the coding sequence are generally
determined by a ribosome binding site (prokaryotes) or by the ATG start codon (eukaryotes)
located just upstream of the open reading frame at the 5' end of the mRNA and a transcription
terminator sequence located just downstream of the open reading frame at the 3' end of the
mRNA. A coding sequence can include, but is not limited to, DNA, cDNA, and recombinant
nucleic acid sequences.

An isolated nucleic acid sequence encoding a polypeptide of the present invention may
be manipulated in a variety of ways to provide for expression of the polypeptide. Manipulation
of the nucleic acid sequence prior to its insertion into a vector may be desirable or necessary
depending on the expression vector. The techniques for modifying nucleic acid sequences
utilizing recombinant DNA methods are well known in the art.

The term "control sequences" is defined herein to include all components that are
necessary or advantageous for the expression of a polypeptide of the present invention. Each
control sequence may be native or foreign to the nucleic acid sequence encoding the
polypeptide. Such control sequences include, but are not limited to, a leader, polyadenylation
sequence, propeptide sequence, promoter, signal peptide sequence, and transcription
terminator. At a minimum, the control sequences include a promoter, and transcriptional and
translational stop signals. The control sequences may be provided with linkers for the purpose
of introducing specific restriction sites facilitating ligation of the control sequences with the
coding region of the nucleic acid sequence encoding a polypeptide. The term "operably linked"
is defined herein as a configuration in which a control sequence is appropriately placed at a
position relative to the coding sequence of the DNA sequence such that the control sequence
directs the expression of a polypeptide.

The control sequence may be an appropriate promoter sequence, a nucleic acid
sequence that is recognized by a host cell for expression of the nucleic acid sequence. The
promoter sequence contains transcriptional control sequences that mediate the expression of
the polypeptide. The promoter may be any nucleic acid sequence which shows transcriptional
activity in the host cell of choice including mutant, truncated, and hybrid promoters, and may
be obtained from genes encoding extracellular or intracellular polypeptides either homologous
or heterologous to the host cell.

Examples of suitable promoters for directing the transcription of the nucleic acid
constructs of the present invention, especially in a bacterial host cell, are the promoters
obtained from the E. coli lac operon, Streptomyces coelicolor agarase gene (dagA), Bacillus
subtilis levansucrase gene (sacB), Bacillus licheniformis alpha-amylase gene (amyL), Bacillus
stearothermophilus maltogenic amylase gene (amyM), Bacillus amyloliquefaciens alpha-
amylase gene (amyQ), Bacillus licheniformis penicillinase gene (penP), Bacillus subtilis xylA
and xylB genes, and prokaryotic beta-lactamase gene (Villa-Kamaroff et al., 1978,
Proceedings of the National Academy of Sciences USA 75: 3727-3731), as well as the tac
promoter (DeBoer et al., 1983, Proceedings of the National Academy of Sciences USA 80: 21-
25). Further promoters are described in "Useful proteins from recombinant bacteria" in
Scientific American, 1980, 242: 74-94; and in Sambrook et al., 1989, supra.

Examples of suitable promoters for directing the transcription of the nucleic acid
constructs of the present invention in a filamentous fungal host cell are promoters obtained
from the genes for Aspergillus oryzae TAKA amylase, Rhizomucor miehei aspartic proteinase,
Aspergillus niger neutral alpha-amylase, Aspergillus niger acid stable alpha-amylase,
Aspergillus niger or Aspergillus awamori glucoamylase (glaA), Rhizomucor miehei lipase,
Aspergillus oryzae alkaline protease, Aspergillus oryzae triose phosphate isomerase,
Aspergillus nidulans acetamidase, and Fusarium oxysporum trypsin-like protease (WO
96/00787), as well as the NA2-tpi promoter (a hybrid of the promoters from the genes for
Aspergillus niger neutral alpha-amylase and Aspergillus oryzae triose phosphate isomerase),
and mutant, truncated, and hybrid promoters thereof.

In a yeast host, useful promoters are obtained from the genes for Saccharomyces
cerevisiae enolase (ENO-1), Saccharomyces cerevisiae galactokinase (GAL1),
Saccharomyces cerevisiae alcohol dehydrogenase/glyceraldehyde-3-phosphate
dehydrogenase (ADH2/GAP), and Saccharomyces cerevisiae 3-phosphoglycerate kinase.
Other useful promoters for yeast host cells are described by Romanos et al., 1992, Yeast 8:
423-488.

The control sequence may also be a suitable transcription terminator sequence, a
sequence recognized by a host cell to terminate transcription. The terminator sequence is
operably linked to the 3' terminus of the nucleic acid sequence encoding the polypeptide. Any
terminator which is functional in the host cell of choice may be used in the present invention.

Preferred terminators for filamentous fungal host cells are obtained from the genes for
Aspergillus oryzae TAKA amylase, Aspergillus niger glucoamylase, Aspergillus nidulans
anthranilate synthase, Aspergillus niger alpha-glucosidase, and Fusarium oxysporum trypsin-
like protease.

Preferred terminators for yeast host cells are obtained from the genes for
Saccharomyces cerevisiae enolase, Saccharomyces cerevisiae cytochrome C (CYC1), and
Saccharomyces cerevisiae glyceraldehyde-3-phosphate dehydrogenase. Other useful
terminators for yeast host cells are described by Romanos et al., 1992, supra.

Preferred terminators for bacterial host cells, such as a Bacillus host cell, are the
terminators from Bacillus licheniformis alpha-amylase gene (amyL), the Bacillus
stearothermophilus maltogenic amylase gene (amyM), or the Bacillus amyloliquefaciens alpha-
amylase gene (amyQ).

The control sequence may also be a suitable leader sequence, a nontranslated region
of an mRNA which is important for translation by the host cell. The leader sequence is
operably linked to the 5' terminus of the nucleic acid sequence encoding the polypeptide. Any
leader sequence that is functional in the host cell of choice may be used in the present
invention.

Preferred leaders for filamentous fungal host cells are obtained from the genes for
Aspergillus oryzae TAKA amylase and Aspergillus nidulans triose phosphate isomerase.

Suitable leaders for yeast host cells are obtained from the genes for Saccharomyces
cerevisiae enolase (ENO-1), Saccharomyces cerevisiae 3-phosphoglycerate kinase,
Saccharomyces cerevisiae alpha-factor, and Saccharomyces cerevisiae alcohol
dehydrogenase/glyceraldehyde-3-phosphate dehydrogenase (ADH2/GAP).

The control sequence may also be a polyadenylation sequence, a sequence operably
linked to the 3' terminus of the nucleic acid sequence and which, when transcribed, is
recognized by the host cell as a signal to add polyadenosine residues to transcribed mRNA.
Any polyadenylation sequence which is functional in the host cell of choice may be used in the
present invention.

Preferred polyadenylation sequences for filamentous fungal host cells are obtained
from the genes for Aspergillus oryzae TAKA amylase, Aspergillus niger glucoamylase,
Aspergillus nidulans anthranilate synthase, Fusarium oxysporum trypsin-like protease, and
Aspergillus niger alpha-glucosidase.

Useful polyadenylation sequences for yeast host cells are described by Guo and
Sherman, 1995, Molecular Cellular Biology 15: 5983-5990.

The control sequence may also be a signal peptide coding region that codes for an
amino acid sequence linked to the amino terminus of a polypeptide and directs the encoded
polypeptide into the cell's secretory pathway. The 5' end of the coding sequence of the nucleic
acid sequence may inherently contain a signal peptide coding region naturally linked in
translation reading frame with the segment of the coding region which encodes the secreted
polypeptide. Alternatively, the 5' end of the coding sequence may contain a signal peptide
coding region which is foreign to the coding sequence. The foreign signal peptide coding
region may be required where the coding sequence does not naturally contain a signal peptide
coding region. Alternatively, the foreign signal peptide coding region may simply replace the
natural signal peptide coding region in order to enhance secretion of the polypeptide. However,
any signal peptide coding region which directs the expressed polypeptide into the secretory
pathway of a host cell of choice may be used in the present invention.

Effective signal peptide coding regions for bacterial host cells are the signal peptide
coding regions obtained from the genes for Bacillus NCIB 11837 maltogenic amylase, Bacillus
stearothermophilus alpha-amylase, Bacillus licheniformis subtilisin, Bacillus licheniformis
alpha-amylase, Bacillus stearothermophilus neutral proteases (nprT, nprS, nprM), and Bacillus
subtilis prsA. Further signal peptides are described by Simonen and Palva, 1993,
Microbiological Reviews 57: 109-137.

Effective signal peptide coding regions for filamentous fungal host cells are the signal
peptide coding regions obtained from the genes for Aspergillus oryzae TAKA amylase,
Aspergillus niger neutral amylase, Aspergillus niger glucoamylase, Rhizomucor miehei aspartic
proteinase, Humicola insolens cellulase, and Humicola lanuginosa lipase.

Useful signal peptides for yeast host cells are obtained from the genes for
Saccharomyces cerevisiae alpha-factor and Saccharomyces cerevisiae invertase. Other useful
signal peptide coding regions are described by Romanos et al., 1992, supra.

The control sequence may also be a propeptide coding region that codes for an amino
acid sequence positioned at the amino terminus of a polypeptide. The resultant polypeptide is
known as a proenzyme or propolypeptide (or a zymogen in some cases). A propolypeptide is
generally inactive and can be converted to a mature active polypeptide by catalytic or
autocatalytic cleavage of the propeptide from the propolypeptide. The propeptide coding region
may be obtained from the genes for Bacillus subtilis alkaline protease (aprE), Bacillus subtilis
neutral protease (nprT), Saccharomyces cerevisiae alpha-factor, Rhizomucor miehei aspartic
proteinase, and Myceliophthora thermophila laccase (WO 95/33836).

Where both signal peptide and propeptide regions are present at the amino terminus of 
a polypeptide, the propeptide region is positioned next to the amino terminus of a polypeptide 
and the signal peptide region is positioned next to the amino terminus of the propeptide region. 
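
The ordering described above (signal peptide, then propeptide, then mature polypeptide, reading from the amino terminus) can be sketched as follows. The residue strings are invented placeholders, the function names are hypothetical, and the numbering convention (propeptide residues negative, mature residues starting at 1, no position 0) is an assumption that mirrors position ranges such as "-165 to -1".

```python
def assemble_precursor(signal: str, propeptide: str, mature: str) -> str:
    """Concatenate the regions in N-terminal to C-terminal order."""
    return signal + propeptide + mature


def position_label(index: int, n_signal: int, n_pro: int) -> int:
    """Map a 0-based precursor index to a position relative to the mature part.

    The first mature residue is +1; the residue immediately before it is -1.
    """
    offset = index - (n_signal + n_pro)
    return offset + 1 if offset >= 0 else offset


signal, pro, mature = "MK", "APQ", "GSV"   # placeholder sequences
precursor = assemble_precursor(signal, pro, mature)
labels = [position_label(i, len(signal), len(pro)) for i in range(len(precursor))]
print(precursor, labels)
```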



It may also be desirable to add regulatory sequences which allow the regulation of the
expression of the polypeptide relative to the growth of the host cell. Examples of regulatory
systems are those which cause the expression of the gene to be turned on or off in response
to a chemical or physical stimulus, including the presence of a regulatory compound.
Regulatory systems in prokaryotic systems include the lac, tac, and trp operator systems. In
yeast, the ADH2 system or GAL1 system may be used. In filamentous fungi, the TAKA alpha-
amylase promoter, Aspergillus niger glucoamylase promoter, and Aspergillus oryzae
glucoamylase promoter may be used as regulatory sequences. Other examples of regulatory
sequences are those which allow for gene amplification. In eukaryotic systems, these include
the dihydrofolate reductase gene which is amplified in the presence of methotrexate, and the
metallothionein genes which are amplified with heavy metals. In these cases, the nucleic acid
sequence encoding the polypeptide would be operably linked with the regulatory sequence.

Expression Vectors

The present invention also relates to recombinant expression vectors comprising a
nucleic acid sequence of the present invention, a promoter, and transcriptional and
translational stop signals. The various nucleic acid and control sequences described above
may be joined together to produce a recombinant expression vector which may include one or
more convenient restriction sites to allow for insertion or substitution of the nucleic acid
sequence encoding the polypeptide at such sites. Alternatively, the nucleic acid sequence of
the present invention may be expressed by inserting the nucleic acid sequence or a nucleic
acid construct comprising the sequence into an appropriate vector for expression. In creating
the expression vector, the coding sequence is located in the vector so that the coding
sequence is operably linked with the appropriate control sequences for expression.

The recombinant expression vector may be any vector (e.g., a plasmid or virus) which
can be conveniently subjected to recombinant DNA procedures and can bring about the
expression of the nucleic acid sequence. The choice of the vector will typically depend on the
compatibility of the vector with the host cell into which the vector is to be introduced. The
vectors may be linear or closed circular plasmids.

The vector may be an autonomously replicating vector, i.e., a vector which exists as an
extrachromosomal entity, the replication of which is independent of chromosomal replication,
e.g., a plasmid, an extrachromosomal element, a minichromosome, or an artificial
chromosome. The vector may contain any means for assuring self-replication. Alternatively,
the vector may be one which, when introduced into the host cell, is integrated into the genome
and replicated together with the chromosome(s) into which it has been integrated.
Furthermore, a single vector or plasmid or two or more vectors or plasmids which together
contain the total DNA to be introduced into the genome of the host cell, or a transposon may
be used.

The vectors of the present invention preferably contain one or more selectable markers
which permit easy selection of transformed cells. A selectable marker is a gene the product of
which provides for biocide or viral resistance, resistance to heavy metals, prototrophy to
auxotrophs, and the like. Examples of bacterial selectable markers are the dal genes from
Bacillus subtilis or Bacillus licheniformis. Suitable markers for yeast host cells are ADE2, HIS3,
LEU2, LYS2, MET3, TRP1, and URA3. Selectable markers for use in a filamentous fungal host
cell include, but are not limited to, amdS (acetamidase), argB (ornithine carbamoyltransferase),
bar (phosphinothricin acetyltransferase), hygB (hygromycin phosphotransferase), niaD (nitrate
reductase), pyrG (orotidine-5'-phosphate decarboxylase), sC (sulfate adenyltransferase), trpC
(anthranilate synthase), as well as equivalents thereof. Preferred for use in an Aspergillus cell
are the amdS and pyrG genes of Aspergillus nidulans or Aspergillus oryzae and the bar gene
of Streptomyces hygroscopicus.

The vectors of the present invention preferably contain an element(s) that permits
stable integration of the vector into the host cell's genome or autonomous replication of the
vector in the cell independent of the genome.

For integration into the host cell genome, the vector may rely on the nucleic acid
sequence encoding the polypeptide or any other element of the vector for stable integration of
the vector into the genome by homologous or nonhomologous recombination. Alternatively, the
vector may contain additional nucleic acid sequences for directing integration by homologous
recombination into the genome of the host cell. The additional nucleic acid sequences enable
the vector to be integrated into the host cell genome at a precise location(s) in the
chromosome(s). To increase the likelihood of integration at a precise location, the integrational
elements should preferably contain a sufficient number of nucleic acids, such as 100 to 1,500
base pairs, preferably 400 to 1,500 base pairs, and most preferably 800 to 1,500 base pairs,
which are highly homologous with the corresponding target sequence to enhance the
probability of homologous recombination. The integrational elements may be any sequence
that is homologous with the target sequence in the genome of the host cell. Furthermore, the
integrational elements may be non-encoding or encoding nucleic acid sequences. On the other
hand, the vector may be integrated into the genome of the host cell by non-homologous
recombination.

For autonomous replication, the vector may further comprise an origin of replication
enabling the vector to replicate autonomously in the host cell in question. Examples of
bacterial origins of replication are the origins of replication of plasmids pBR322, pUC19,
pACYC177, and pACYC184 permitting replication in E. coli, and pUB110, pE194, pTA1060,
and pAMβ1 permitting replication in Bacillus. Examples of origins of replication for use in a
yeast host cell are the 2 micron origin of replication, ARS1, ARS4, the combination of ARS1
and CEN3, and the combination of ARS4 and CEN6. The origin of replication may be one
having a mutation which makes its functioning temperature-sensitive in the host cell (see, e.g.,
Ehrlich, 1978, Proceedings of the National Academy of Sciences USA 75: 1433).

More than one copy of a nucleic acid sequence of the present invention may be
inserted into the host cell to increase production of the gene product. An increase in the copy
number of the nucleic acid sequence can be obtained by integrating at least one additional
copy of the sequence into the host cell genome or by including an amplifiable selectable
marker gene with the nucleic acid sequence where cells containing amplified copies of the
selectable marker gene, and thereby additional copies of the nucleic acid sequence, can be
selected for by cultivating the cells in the presence of the appropriate selectable agent.

The procedures used to ligate the elements described above to construct the
recombinant expression vectors of the present invention are well known to one skilled in the art
(see, e.g., Sambrook et al., 1989, supra).

The protease may also be co-expressed together with at least one other enzyme of
interest for animal feed, such as phytase (EC 3.1.3.8 or 3.1.3.26); xylanase (EC 3.2.1.8);
galactanase (EC 3.2.1.89); alpha-galactosidase (EC 3.2.1.22); protease (EC 3.4.-.-);
phospholipase A1 (EC 3.1.1.32); phospholipase A2 (EC 3.1.1.4); lysophospholipase (EC
3.1.1.5); phospholipase C (EC 3.1.4.3); phospholipase D (EC 3.1.4.4); and/or beta-glucanase
(EC 3.2.1.4 or EC 3.2.1.6).

The enzymes may be co-expressed from different vectors, from one vector, or using a
mixture of both techniques. When using different vectors, the vectors may have different
selectable markers, and different origins of replication. When using only one vector, the genes
can be expressed from one or more promoters. If cloned under the regulation of one promoter
(di- or multicistronic), the order in which the genes are cloned may affect the expression levels
of the proteins. The protease may also be expressed as a fusion protein, i.e. that the gene
encoding the protease has been fused in frame to the gene encoding another protein. This
protein may be another enzyme or a functional domain from another enzyme.

Accordingly, the invention also relates to a recombinant expression vector or
polynucleotide construct comprising a polynucleotide of the invention.

Host Cells 

The present invention also relates to recombinant host cells, comprising a nucleic acid
sequence of the invention, which are advantageously used in the recombinant production of
the polypeptides. A vector comprising a nucleic acid sequence of the present invention is
introduced into a host cell so that the vector is maintained as a chromosomal integrant or as a
self-replicating extra-chromosomal vector as described earlier. The term "host cell"
encompasses any progeny of a parent cell that is not identical to the parent cell due to
mutations that occur during replication. The choice of a host cell will to a large extent depend
upon the gene encoding the polypeptide and its source. The host cell may be a unicellular
microorganism, e.g., a prokaryote, or a non-unicellular microorganism, e.g., a eukaryote.

Useful unicellular cells are bacterial cells such as gram positive bacteria including, but
not limited to, a Bacillus cell, or a Streptomyces cell, or cells of lactic acid bacteria; or gram
negative bacteria such as E. coli and Pseudomonas sp. Lactic acid bacteria include, but are
not limited to, species of the genera Lactococcus, Lactobacillus, Leuconostoc, Streptococcus,
Pediococcus, and Enterococcus. Useful unicellular cells are bacterial cells such as gram
positive bacteria including, but not limited to, a Bacillus cell, e.g., Bacillus alkalophilus, Bacillus
amyloliquefaciens, Bacillus brevis, Bacillus circulans, Bacillus clausii, Bacillus coagulans,
Bacillus lautus, Bacillus lentus, Bacillus licheniformis, Bacillus megaterium, Bacillus
stearothermophilus, Bacillus subtilis, and Bacillus thuringiensis; or a Streptomyces cell, e.g.,
Streptomyces lividans or Streptomyces murinus, or gram negative bacteria such as E. coli and
Pseudomonas sp. In a preferred embodiment, the bacterial host cell is a Bacillus lentus,
Bacillus licheniformis, Bacillus stearothermophilus or Bacillus subtilis cell. In another preferred
embodiment, the Bacillus cell is an alkalophilic Bacillus.

The introduction of a vector into a bacterial host cell may, for instance, be effected by
protoplast transformation (see, e.g., Chang and Cohen, 1979, Molecular General Genetics
168: 111-115), using competent cells (see, e.g., Young and Spizizin, 1961, Journal of
Bacteriology 81: 823-829, or Dubnau and Davidoff-Abelson, 1971, Journal of Molecular
Biology 56: 209-221), electroporation (see, e.g., Shigekawa and Dower, 1988, Biotechniques
6: 742-751), or conjugation (see, e.g., Koehler and Thorne, 1987, Journal of Bacteriology 169:
5771-5278). The host cell may be a eukaryote, such as a non-human animal cell, an insect
cell, a plant cell, or a fungal cell. In one particular embodiment, the host cell is a fungal cell.
"Fungi" as used herein includes the phyla Ascomycota, Basidiomycota, Chytridiomycota, and
Zygomycota (as defined by Hawksworth et al., In, Ainsworth and Bisby's Dictionary of The
Fungi, 8th edition, 1995, CAB International, University Press, Cambridge, UK) as well as the
Oomycota (as cited in Hawksworth et al., 1995, supra, page 171) and all mitosporic fungi
(Hawksworth et al., 1995, supra).

In another particular embodiment, the fungal host cell is a yeast cell. "Yeast" as used
herein includes ascosporogenous yeast (Endomycetales), basidiosporogenous yeast, and
yeast belonging to the Fungi Imperfecti (Blastomycetes). Since the classification of yeast may
change in the future, for the purposes of this invention, yeast shall be defined as described in
Biology and Activities of Yeast (Skinner, F.A., Passmore, S.M., and Davenport, R.R., eds, Soc.
App. Bacteriol. Symposium Series No. 9, 1980).

The yeast host cell may be a Candida, Hansenula, Kluyveromyces, Pichia,
Saccharomyces, Schizosaccharomyces, or Yarrowia cell.

The fungal host cell may be a filamentous fungal cell. "Filamentous fungi" include all
filamentous forms of the subdivision Eumycota and Oomycota (as defined by Hawksworth et
al., 1995, supra). The filamentous fungi are characterised by a mycelial wall composed of
chitin, cellulose, glucan, chitosan, mannan, and other complex polysaccharides. Vegetative
growth is by hyphal elongation and carbon catabolism is obligately aerobic. In contrast,
vegetative growth by yeasts such as Saccharomyces cerevisiae is by budding of a unicellular
thallus and carbon catabolism may be fermentative.

Examples of filamentous fungal host cells are cells of species of, but not limited to,
Acremonium, Aspergillus, Fusarium, Humicola, Mucor, Myceliophthora, Neurospora,
Penicillium, Thielavia, Tolypocladium, or Trichoderma.

Fungal cells may be transformed by a process involving protoplast formation,
transformation of the protoplasts, and regeneration of the cell wall in a manner known per se.
Suitable procedures for transformation of Aspergillus host cells are described in EP 238 023
and Yelton et al., 1984, Proceedings of the National Academy of Sciences USA 81: 1470-
1474. Suitable methods for transforming Fusarium species are described by Malardier et al.,
1989, Gene 78: 147-156 and WO 96/00787. Yeast may be transformed using the procedures
described by Becker and Guarente, In Abelson, J.N. and Simon, M.I., editors, Guide to Yeast
Genetics and Molecular Biology, Methods in Enzymology, Volume 194, pp 182-187, Academic
Press, Inc., New York; Ito et al., 1983, Journal of Bacteriology 153: 163; and Hinnen et al.,
1978, Proceedings of the National Academy of Sciences USA 75: 1920.

The invention relates to a recombinant host cell comprising a polynucleotide of the
invention, or an expression vector or polynucleotide construct of the invention. In a preferred
embodiment, the recombinant host cell is a Bacillus cell.

Plants

The present invention also relates to a transgenic plant, plant part, or plant cell which
has been transformed with a nucleic acid sequence encoding a polypeptide having protease
activity of the present invention so as to express and produce the polypeptide in recoverable
quantities. The polypeptide may be recovered from the plant or plant part. Alternatively, the
plant or plant part containing the recombinant polypeptide may be used as such for improving
the quality of a food or feed, e.g., improving nutritional value, palatability, and rheological
properties, or to destroy an antinutritive factor.

In a particular embodiment, the polypeptide is targeted to the endosperm storage
vacuoles in seeds. This can be obtained by synthesizing it as a precursor with a suitable signal
peptide, see Horvath et al. in PNAS, Feb. 15, 2000, vol. 97, no. 4, p. 1914-1919.

The transgenic plant can be dicotyledonous (a dicot) or monocotyledonous (a
monocot) or engineered variants thereof. Examples of monocot plants are grasses, such as
meadow grass (blue grass, Poa), forage grass such as Festuca, Lolium, temperate grass, such
as Agrostis, and cereals, e.g., wheat, oats, rye, barley, rice, sorghum, and maize (corn).
Examples of dicot plants are tobacco, legumes, such as lupins, potato, sugar beet, pea, bean
and soybean, and cruciferous plants (family Brassicaceae), such as cauliflower, rape seed,
and the closely related model organism Arabidopsis thaliana. Low-phytate plants as described
e.g. in US patent no. 5,689,054 and US patent no. 6,111,168 are examples of engineered
plants.

Examples of plant parts are stem, callus, leaves, root, fruits, seeds, and tubers. Also
specific plant tissues, such as chloroplast, apoplast, mitochondria, vacuole, peroxisomes, and
cytoplasm are considered to be a plant part. Furthermore, any plant cell, whatever the tissue
origin, is considered to be a plant part.

Also included within the scope of the present invention are the progeny of such plants,
plant parts and plant cells.

The transgenic plant or plant cell expressing a polypeptide of the present invention
may be constructed in accordance with methods known in the art. Briefly, the plant or plant cell
is constructed by incorporating one or more expression constructs encoding a polypeptide of
the present invention into the plant host genome and propagating the resulting modified plant
or plant cell into a transgenic plant or plant cell.

Conveniently, the expression construct is a nucleic acid construct which comprises a
nucleic acid sequence encoding a polypeptide of the present invention operably linked with
appropriate regulatory sequences required for expression of the nucleic acid sequence in the
plant or plant part of choice. Furthermore, the expression construct may comprise a selectable
marker useful for identifying host cells into which the expression construct has been integrated
and DNA sequences necessary for introduction of the construct into the plant in question (the
latter depends on the DNA introduction method to be used).

The choice of regulatory sequences, such as promoter and terminator sequences and
optionally signal or transit sequences, is determined, for example, on the basis of when,
where, and how the polypeptide is desired to be expressed. For instance, the expression of the
gene encoding a polypeptide of the present invention may be constitutive or inducible, or may
be developmental, stage or tissue specific, and the gene product may be targeted to a specific
tissue or plant part such as seeds or leaves. Regulatory sequences are, for example,
described by Tague et al., 1988, Plant Physiology 86: 506.

For constitutive expression, the 35S-CaMV promoter may be used (Franck et al.,
1980, Cell 21: 285-294). Organ-specific promoters may be, for example, a promoter from
storage sink tissues such as seeds, potato tubers, and fruits (Edwards & Coruzzi, 1990, Ann.
Rev. Genet. 24: 275-303), or from metabolic sink tissues such as meristems (Ito et al., 1994,
Plant Mol. Biol. 24: 863-878), a seed specific promoter such as the glutelin, prolamin, globulin,
or albumin promoter from rice (Wu et al., 1998, Plant and Cell Physiology 39: 885-889), a Vicia
faba promoter from the legumin B4 and the unknown seed protein gene from Vicia faba
(Conrad et al., 1998, Journal of Plant Physiology 152: 708-711), a promoter from a seed oil
body protein (Chen et al., 1998, Plant and Cell Physiology 39: 935-941), the storage protein
napA promoter from Brassica napus, or any other seed specific promoter known in the art,
e.g., as described in WO 91/14772. Furthermore, the promoter may be a leaf specific promoter
such as the rbcs promoter from rice or tomato (Kyozuka et al., 1993, Plant Physiology 102:
991-1000), the chlorella virus adenine methyltransferase gene promoter (Mitra and Higgins,
1994, Plant Molecular Biology 26: 85-93), or the aldP gene promoter from rice (Kagaya et al.,
1995, Molecular and General Genetics 248: 668-674), or a wound inducible promoter such as
the potato pin2 promoter (Xu et al., 1993, Plant Molecular Biology 22: 573-588).

A promoter enhancer element may also be used to achieve higher expression of the
protease in the plant. For instance, the promoter enhancer element may be an intron which is
placed between the promoter and the nucleotide sequence encoding a polypeptide of the
present invention. For instance, Xu et al., 1993, supra disclose the use of the first intron of the
rice actin 1 gene to enhance expression.

Still further, the codon usage may be optimised for the plant species in question to
improve expression (see Horvath et al. referred to above).

The selectable marker gene and any other parts of the expression construct may be
chosen from those available in the art.

The nucleic acid construct is incorporated into the plant genome according to
conventional techniques known in the art, including Agrobacterium-mediated transformation,
virus-mediated transformation, microinjection, particle bombardment, biolistic transformation,
and electroporation (Gasser et al., 1990, Science 244: 1293; Potrykus, 1990, Bio/Technology
8: 535; Shimamoto et al., 1989, Nature 338: 274).

Presently, Agrobacterium tumefaciens-mediated gene transfer is the method of choice
for generating transgenic dicots (for a review, see Hooykas and Schilperoort, 1992, Plant
Molecular Biology 19: 15-38). However, it can also be used for transforming monocots,
although other transformation methods are generally preferred for these plants. Presently, the
method of choice for generating transgenic monocots is particle bombardment (microscopic
gold or tungsten particles coated with the transforming DNA) of embryonic calli or developing
embryos (Christou, 1992, Plant Journal 2: 275-281; Shimamoto, 1994, Current Opinion
Biotechnology 5: 158-162; Vasil et al., 1992, Bio/Technology 10: 667-674). An alternative
method for transformation of monocots is based on protoplast transformation as described by
Omirulleh et al., 1993, Plant Molecular Biology 21: 415-428.

Following transformation, the transformants having incorporated therein the
expression construct are selected and regenerated into whole plants according to methods
well-known in the art.

The present invention also relates to methods for producing a polypeptide of the
present invention comprising (a) cultivating a transgenic plant or a plant cell comprising a
nucleic acid sequence encoding a polypeptide having protease activity of the present invention
under conditions conducive for production of the polypeptide; and (b) recovering the
polypeptide. The invention relates to a transgenic plant, or plant part, comprising a
polynucleotide as defined in claim 8, or an expression vector or polynucleotide construct of the
invention.

Animals 

The present invention also relates to a transgenic, non-human animal and products or
elements thereof, examples of which are body fluids such as milk and blood, organs, flesh, and
animal cells. Techniques for expressing proteins, e.g. in mammalian cells, are known in the art,
see e.g. the handbook Protein Expression: A Practical Approach, Higgins and Hames (eds),
Oxford University Press (1999), and the three other handbooks in this series relating to Gene
Transcription, RNA processing, and Post-translational Processing. Generally speaking, to
prepare a transgenic animal, selected cells of a selected animal are transformed with a nucleic
acid sequence encoding a polypeptide having protease activity of the present invention so as
to express and produce the polypeptide. The polypeptide may be recovered from the animal,
e.g. from the milk of female animals, or the polypeptide may be expressed to the benefit of the
animal itself, e.g. to assist the animal's digestion. Examples of animals are mentioned below in
the section headed Animal Feed.

To produce a transgenic animal with a view to recovering the protease from the milk of
the animal, a gene encoding the protease may be inserted into the fertilized eggs of an animal
in question, e.g. by use of a transgene expression vector which comprises a suitable milk
protein promoter, and the gene encoding the protease. The transgene expression vector is
microinjected into fertilized eggs, and preferably permanently integrated into the chromosome.
Once the egg begins to grow and divide, the potential embryo is implanted into a surrogate
mother, and animals carrying the transgene are identified. The resulting animal can then be
multiplied by conventional breeding. The polypeptide may be purified from the animal's milk,
see e.g. Meade, H.M. et al. (1999): Expression of recombinant proteins in the milk of transgenic
animals, Gene expression systems: Using nature for the art of expression. J. M. Fernandez
and J. P. Hoeffler (eds.), Academic Press.

In the alternative, in order to produce a transgenic non-human animal that carries in the
genome of its somatic and/or germ cells a nucleic acid sequence including a heterologous
transgene construct including a transgene encoding the protease, the transgene may be
operably linked to a first regulatory sequence for salivary gland specific expression of the
protease, as disclosed in WO 2000064247.

The invention relates to a transgenic, non-human animal, or products, or elements
thereof, comprising a polynucleotide, or an expression vector or polynucleotide construct of the
invention.

Methods of Production 

The present invention also relates to methods for producing a polypeptide of the
present invention comprising (a) cultivating a host cell or a transgenic plant or animal under
conditions conducive for production of the polypeptide in a supernatant; and optionally (b)
recovering the polypeptide.

In the production methods of the present invention, the cells are cultivated in a nutrient
medium suitable for production of the polypeptide using methods known in the art. For
example, the cell may be cultivated by shake flask cultivation, small-scale or large-scale
fermentation (including continuous, batch, fed-batch, or solid state fermentations) in laboratory
or industrial fermentors performed in a suitable medium and under conditions allowing the
polypeptide to be expressed and/or isolated. The cultivation takes place in a suitable nutrient
medium comprising carbon and nitrogen sources and inorganic salts, using procedures known
in the art. Suitable media are available from commercial suppliers or may be prepared
according to published compositions (e.g., in catalogues of the American Type Culture
Collection). If the polypeptide is secreted into the nutrient medium, the polypeptide can be
recovered directly from the medium. If the polypeptide is not secreted, it can be recovered from
cell lysates.

The polypeptides may be detected using methods known in the art that are specific for
the polypeptides. These detection methods may include use of specific antibodies, formation of
a product, or disappearance of a substrate. For example, a protease assay may be used to
determine the activity of the polypeptide as described herein.

The resulting polypeptide may be recovered by methods known in the art. For example,
the polypeptide may be recovered from the nutrient medium by conventional procedures
including, but not limited to, centrifugation, filtration, extraction, spray-drying, evaporation, or
precipitation.

The polypeptides of the present invention may be purified by a variety of procedures
known in the art including, but not limited to, chromatography (e.g., ion exchange, affinity,
hydrophobic, chromatofocusing, and size exclusion), electrophoretic procedures (e.g.,
preparative isoelectric focusing), differential solubility (e.g., ammonium sulfate precipitation),
SDS-PAGE, or extraction (see, e.g., Protein Purification, J.-C. Janson and Lars Ryden, editors,
VCH Publishers, New York, 1989).

Compositions

In a still further aspect, the present invention relates to compositions comprising a
polypeptide of the present invention. The polypeptide compositions may be prepared in
accordance with methods known in the art and may be in the form of a liquid or a dry
composition. For instance, the polypeptide composition may be in the form of a granulate or a
microgranulate. The polypeptide to be included in the composition may be stabilised in
accordance with methods known in the art. Examples are given below of preferred uses of the
polypeptides or polypeptide compositions of the invention.

Animal Feed 

The present invention is also directed to methods for using the polypeptides of the
invention in animal feed, as well as to feed compositions and feed additives comprising the
polypeptides of the invention. The term animal includes all animals, including human beings.
Examples of animals are non-ruminants, and ruminants, such as cows, sheep and horses. In a
particular embodiment, the animal is a non-ruminant animal. Non-ruminant animals include
mono-gastric animals, e.g. pigs or swine (including, but not limited to, piglets, growing pigs,
and sows); poultry such as turkeys, ducks and chicken (including but not limited to broiler
chicks, layers); young calves; and fish (including but not limited to salmon, trout, tilapia, catfish
and carps); and crustaceans (including but not limited to shrimps and prawns).

The term feed or feed composition means any compound, preparation, mixture, or 
composition suitable for, or intended for intake by an animal. 

In the use according to the invention the protease can be fed to the animal before,
after, or simultaneously with the diet. The latter is preferred.

In a particular embodiment, the protease, in the form in which it is added to the feed, or
when being included in a feed additive, is well-defined. Well-defined means that the protease
preparation is at least 50% pure as determined by size-exclusion chromatography (see
Example 12 of WO 01/58275). In other particular embodiments the protease preparation is at
least 60, 70, 80, 85, 88, 90, 92, 94, or at least 95% pure as determined by this method. A well-
defined protease preparation is advantageous. For instance, it is much easier to dose correctly
to the feed a protease that is essentially free from interfering or contaminating other proteases.
The term dose correctly refers in particular to the objective of obtaining consistent and
constant results, and the capability of optimising dosage based upon the desired effect.
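
By way of illustration only, the purity criterion above can be sketched as a small calculation. The sketch assumes that purity is taken as the protease peak area divided by the total peak area of a size-exclusion chromatogram; that formula, the function names, and the example peak areas are assumptions for illustration and are not taken from the referenced example.

```python
from typing import Sequence


def purity_percent(protease_peak_area: float, all_peak_areas: Sequence[float]) -> float:
    """Purity as protease peak area over total peak area (assumed definition)."""
    total = sum(all_peak_areas)
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return 100.0 * protease_peak_area / total


def is_well_defined(purity: float, threshold: float = 50.0) -> bool:
    """True when the preparation meets the stated minimum purity (default 50%)."""
    return purity >= threshold


# Hypothetical chromatogram: one protease peak and two contaminant peaks.
example_purity = purity_percent(60.0, [60.0, 25.0, 15.0])
print(example_purity, is_well_defined(example_purity))
```

The threshold argument may be raised to any of the stricter values listed above, e.g. 95.0.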

For the use in animal feed, however, the protease need not be that pure; it may e.g.
include other enzymes, in which case it could be termed a protease preparation. The protease
preparation can be (a) added directly to the feed (or used directly in a treatment process of
vegetable proteins), or (b) it can be used in the production of one or more intermediate
compositions such as feed additives or premixes that is subsequently added to the feed (or
used in a treatment process). The degree of purity described above refers to the purity of the
original protease preparation, whether used according to (a) or (b) above.

Protease preparations with purities of this order of magnitude are in particular
obtainable using recombinant methods of production, whereas they are not so easily obtained
and also subject to a much higher batch-to-batch variation when the protease is produced by
traditional fermentation methods. Such protease preparation may of course be mixed with
other enzymes.

In a particular embodiment, the protease for use according to the invention is capable
of solubilising vegetable proteins. A suitable assay for determining solubilised protein is
disclosed in Example 11.

The term vegetable proteins as used herein refers to any compound, composition,
preparation or mixture that includes at least one protein derived from or originating from a
vegetable, including modified proteins and protein-derivatives. In particular embodiments, the
protein content of the vegetable proteins is at least 10, 20, 30, 40, 50, or 60% (w/w). Vegetable
proteins may be derived from vegetable protein sources, such as legumes and cereals, for
example materials from plants of the families Fabaceae (Leguminosae), Cruciferaceae,
Chenopodiaceae, and Poaceae, such as soy bean meal, lupin meal and rapeseed meal. In a
particular embodiment, the vegetable protein source is material from one or more plants of the
family Fabaceae, e.g. soybean, lupine, pea, or bean.

In another particular embodiment, the vegetable protein source is material from one or
more plants of the family Chenopodiaceae, e.g. beet, sugar beet, spinach or quinoa. Other
examples of vegetable protein sources are rapeseed, and cabbage. Soybean is a preferred
vegetable protein source. Other examples of vegetable protein sources are cereals such as
barley, wheat, rye, oat, maize (corn), rice, and sorghum.

The treatment according to the invention of vegetable proteins with at least one
protease of the invention results in an increased solubilisation of vegetable proteins. The
following are examples of % solubilised protein obtainable using the proteases of the invention
in a monogastric in vitro model: At least 102%, 103%, 104%, 105%, 106%, or at least 107%,
relative to a blank. The percentage of solubilised protein is determined using the monogastric
in vitro model of Example 11. The term solubilisation of proteins basically means bringing
protein(s) into solution. Such solubilisation may be due to protease-mediated release of protein
from other components of the usually complex natural compositions such as feed.
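
As a minimal sketch of how a figure "relative to a blank" may be read, the following expresses a protease-treated measurement as a percentage of the blank (no-protease) measurement, so that 100% means no change. The function names and the example readings are hypothetical and are not data from the Examples.

```python
def percent_of_blank(sample_value: float, blank_value: float) -> float:
    """Express a treated-sample measurement as a percentage of the blank."""
    if blank_value <= 0:
        raise ValueError("blank value must be positive")
    return 100.0 * sample_value / blank_value


def meets_minimum(sample_value: float, blank_value: float, minimum_percent: float) -> bool:
    """True when the treated sample reaches at least minimum_percent of the blank."""
    return percent_of_blank(sample_value, blank_value) >= minimum_percent


# Hypothetical solubilised-protein readings (arbitrary units).
print(round(percent_of_blank(5.2, 5.0), 1))
print(meets_minimum(5.2, 5.0, 102.0))
```

The same calculation applies to the digestible-protein and Degree of Hydrolysis figures given for the in vitro models.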

In a further particular embodiment, the protease for use according to the invention is
capable of increasing the amount of digestible vegetable proteins. The following are examples
of % digested or digestible protein obtainable using the proteases of the invention in a
monogastric in vitro model: At least 104%, 105%, 106%, 107%, 108%, 109%, or at least
110%, relative to a blank. The percentage of digested or digestible protein is determined using
the in vitro model of Example 11.

The following are examples of % digested or digestible protein obtainable using the
proteases of the invention in an aquaculture in vitro model: At least 103%, 104%, 105%, 106%,
107%, 108%, 109% or at least 110%, relative to a blank. The percentage of digested or
digestible protein is determined using the aquaculture in vitro model of Example 12.

In a still further particular embodiment, the protease for use according to the invention
is capable of increasing the Degree of Hydrolysis (DH) of vegetable proteins. The following are
examples of Degree of Hydrolysis increase obtainable in a monogastric in vitro model: At least
102%, 103%, 104%, 105%, 106%, or at least 107%, relative to a blank. The DH is determined
using the monogastric in vitro model of Example 11. The following are examples of Degree of
Hydrolysis increase obtainable in an aquaculture in vitro model: At least 102%, 103%, 104%,
105%, 106%, or at least 107%, relative to a blank. The DH is determined using the aquaculture
in vitro model of Example 12.

In a particular embodiment of a (pre-) treatment process of the invention, the
protease(s) in question is affecting (or acting on, or exerting its solubilising influence on) the
vegetable proteins or protein sources. To achieve this, the vegetable protein or protein source
is typically suspended in a solvent, e.g. an aqueous solvent such as water, and the pH and
temperature values are adjusted paying due regard to the characteristics of the enzyme in
question. For example, the treatment may take place at a pH-value at which the activity of the
actual protease is at least 40%, 50%, 60%, 70%, 80% or at least 90%. Likewise, for
example, the treatment may take place at a temperature at which the activity of the actual
protease is at least 40%, 50%, 60%, 70%, 80% or at least 90%. The above percentage activity
indications are relative to the maximum activities. The enzymatic reaction is continued until the
desired result is achieved, following which it may or may not be stopped by inactivating the
enzyme, e.g. by a heat-treatment step.

In another particular embodiment of a treatment process of the invention, the protease
action is sustained, meaning e.g. that the protease is added to the vegetable proteins or
protein sources, but its solubilising influence is so to speak not switched on until later when
desired, once suitable solubilising conditions are established, or once any enzyme inhibitors
are inactivated, or whatever other means could have been applied to postpone the action of
the enzyme.

In one embodiment the treatment is a pre-treatment of animal feed or vegetable
proteins for use in animal feed.

The term improving the nutritional value of an animal feed means improving the
availability and/or digestibility of the proteins, thereby leading to increased protein extraction
from the diet components, higher protein yields, increased protein degradation and/or
improved protein utilisation. The nutritional value of the feed is therefore increased, and the
animal performance such as growth rate and/or weight gain and/or feed conversion ratio (i.e.
the weight of ingested feed relative to weight gain) of the animal is/are improved.

In a particular embodiment the feed conversion ratio is increased by at least 1%, 2%,
3%, 4%, 5%, 6%, 7%, 8%, 9% or at least 10%. In a further particular embodiment the weight
gain is increased by at least 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10% or at least 11%. These
figures are relative to control experiments with no protease addition.
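
The two performance figures can be illustrated with a short calculation. The following is a minimal sketch in Python, assuming the feed conversion ratio is taken as the weight of ingested feed divided by the weight gain, and that percentage figures are computed relative to a control group without protease. The function names and all numbers are hypothetical and are not data from the examples.

```python
def feed_conversion_ratio(feed_intake_g: float, weight_gain_g: float) -> float:
    """Weight of ingested feed relative to weight gain."""
    if weight_gain_g <= 0:
        raise ValueError("weight gain must be positive")
    return feed_intake_g / weight_gain_g


def percent_change(treated: float, control: float) -> float:
    """Change of a treated-group figure relative to the control group, in percent."""
    return (treated - control) / control * 100.0


# Hypothetical figures for a control group and a protease-treated group.
control_gain, treated_gain = 1000.0, 1050.0   # g weight gain per animal
control_feed, treated_feed = 1800.0, 1800.0   # g feed ingested per animal

gain_change = percent_change(treated_gain, control_gain)
fcr_control = feed_conversion_ratio(control_feed, control_gain)
fcr_treated = feed_conversion_ratio(treated_feed, treated_gain)
print(f"weight gain change: {gain_change:.1f}%")
print(f"FCR control {fcr_control:.3f}, treated {fcr_treated:.3f}")
```

With equal feed intake, a higher weight gain in the treated group gives a lower ratio of feed to gain than in the control group.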

The feed conversion ratio (FCR) and the weight gain may be calculated as described in
EEC (1986): Directive de la Commission du 9 avril 1986 fixant la méthode de calcul de la
valeur énergétique des aliments composés destinés à la volaille. Journal Officiel des
Communautés Européennes, L130, 53-54.

The protease can be added to the feed in any form, be it as a relatively pure protease,
or in admixture with other components intended for addition to animal feed, i.e. in the form of
animal feed additives, such as the so-called pre-mixes for animal feed.

In a further aspect the present invention relates to compositions for use in animal feed,
such as animal feed, and animal feed additives, e.g. premixes.

Apart from the protease of the invention, the animal feed additives of the invention 
contain at least one fat-soluble vitamin, and/or at least one water soluble vitamin, and/or at 
least one trace mineral. The feed additive may also contain at least one macro mineral. 

Further, optional, feed-additive ingredients are colouring agents, aroma compounds,
stabilisers, antimicrobial peptides, including antifungal polypeptides, and/or at least one other
enzyme selected from amongst phytase (EC 3.1.3.8 or 3.1.3.26); xylanase (EC 3.2.1.8);
galactanase (EC 3.2.1.89); alpha-galactosidase (EC 3.2.1.22); protease (EC 3.4.-.-);
phospholipase A1 (EC 3.1.1.32); phospholipase A2 (EC 3.1.1.4); lysophospholipase (EC
3.1.1.5); phospholipase C (EC 3.1.4.3); phospholipase D (EC 3.1.4.4); and/or beta-glucanase (EC
3.2.1.4 or EC 3.2.1.6).

In a particular embodiment these other enzymes are well-defined (as defined above for
protease preparations).

Examples of antimicrobial peptides (AMPs) are CAP18, Leucocin A, Tritrpticin,
Protegrin-1, Thanatin, Defensin, Lactoferrin, Lactoferricin, and Ovispirin such as Novispirin
(Robert Lehrer, 2000), Plectasins, and Statins, including the compounds and polypeptides
disclosed in PCT/DK02/00781 and PCT/DK02/00812, as well as variants or fragments of the
above that retain antimicrobial activity.

Examples of antifungal polypeptides (AFPs) are the Aspergillus giganteus and
Aspergillus niger peptides, as well as variants and fragments thereof which retain antifungal
activity, as disclosed in WO 94/01459 and WO 02/090384.

Usually fat- and water-soluble vitamins, as well as trace minerals, form part of a so-called
premix intended for addition to the feed, whereas macro minerals are usually separately added
to the feed. A premix enriched with a protease of the invention is an example of an animal
feed additive of the invention.

In a particular embodiment, the animal feed additive of the invention is intended for
being included (or prescribed as having to be included) in animal diets or feed at levels of 0.01
to 10.0%; more particularly 0.05 to 5.0%; or 0.2 to 1.0% (% meaning g additive per 100 g
feed). This is so in particular for premixes.

The following are non-exclusive lists of examples of these components:

Examples of fat-soluble vitamins are vitamin A, vitamin D3, vitamin E, and vitamin K,
e.g. vitamin K3.

Examples of water-soluble vitamins are vitamin B12, biotin and choline, vitamin B1,
vitamin B2, vitamin B6, niacin, folic acid and panthothenate, e.g. Ca-D-panthothenate.

Examples of trace minerals are manganese, zinc, iron, copper, iodine, selenium, and
cobalt.

Examples of macro minerals are calcium, phosphorus and sodium.

The nutritional requirements of these components (exemplified with poultry and
piglets/pigs) are listed in Table A of WO 01/58275. Nutritional requirement means that these
components should be provided in the diet in the concentrations indicated.

In the alternative, the animal feed additive of the invention comprises at least one of the
individual components specified in Table A of WO 01/58275. At least one means either of, one
or more of, one, or two, or three, or four and so forth up to all thirteen, or up to all fifteen
individual components. More specifically, this at least one individual component is included in
the additive of the invention in such an amount as to provide an in-feed-concentration within
the range indicated in column four, or column five, or column six of Table A.

The present invention also relates to animal feed compositions. Animal feed
compositions or diets have a relatively high content of protein. Poultry and pig diets can be
characterised as indicated in Table B of WO 01/58275, columns 2-3. Fish diets can be
characterised as indicated in column 4 of this Table B. Furthermore such fish diets usually
have a crude fat content of 200-310 g/kg. WO 01/58275 corresponds to US 09/778334 which
is hereby incorporated by reference.

An animal feed composition according to the invention has a crude protein content of 
50-800 g/kg, and furthermore comprises at least one protease as claimed herein. 

Furthermore, or in the alternative (to the crude protein content indicated above), the
animal feed composition of the invention has a content of metabolisable energy of 10-30
MJ/kg; and/or a content of calcium of 0.1-200 g/kg; and/or a content of available phosphorus of
0.1-200 g/kg; and/or a content of methionine of 0.1-100 g/kg; and/or a content of methionine
plus cysteine of 0.1-150 g/kg; and/or a content of lysine of 0.5-50 g/kg.

In particular embodiments, the content of metabolisable energy, crude protein, calcium,
phosphorus, methionine, methionine plus cysteine, and/or lysine is within any one of ranges 2,
3, 4 or 5 in Table B of WO 01/58275 (R. 2-5).

Crude protein is calculated as nitrogen (N) multiplied by a factor 6.25, i.e. Crude protein
(g/kg) = N (g/kg) x 6.25. The nitrogen content is determined by the Kjeldahl method (A.O.A.C.,
1984, Official Methods of Analysis, 14th ed., Association of Official Analytical Chemists,
Washington DC).
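
The crude protein calculation can be sketched as follows in Python, using the conventional Kjeldahl nitrogen-to-protein conversion factor of 6.25. The function names and the nitrogen value are hypothetical illustrations; the range check refers to the 50-800 g/kg crude protein content given above for the feed composition.

```python
KJELDAHL_FACTOR = 6.25  # conventional nitrogen-to-protein conversion factor


def crude_protein_g_per_kg(nitrogen_g_per_kg: float) -> float:
    """Crude protein (g/kg) = N (g/kg) x 6.25."""
    if nitrogen_g_per_kg < 0:
        raise ValueError("nitrogen content cannot be negative")
    return nitrogen_g_per_kg * KJELDAHL_FACTOR


def within_claimed_range(crude_protein: float, low: float = 50.0, high: float = 800.0) -> bool:
    """Check a crude protein content against the 50-800 g/kg range."""
    return low <= crude_protein <= high


# Hypothetical Kjeldahl result: 32 g nitrogen per kg feed.
protein = crude_protein_g_per_kg(32.0)
print(protein, within_claimed_range(protein))
```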

Metabolisable energy can be calculated on the basis of the NRC publication Nutrient
requirements in swine, ninth revised edition 1988, subcommittee on swine nutrition, committee
on animal nutrition, board of agriculture, national research council. National Academy Press,
Washington, D.C., pp. 2-6, and the European Table of Energy Values for Poultry Feed-stuffs,
Spelderholt centre for poultry research and extension, 7361 DA Beekbergen, The Netherlands.
Grafisch bedrijf Ponsen & looijen bv, Wageningen. ISBN 90-71463-12-5.

The dietary content of calcium, available phosphorus and amino acids in complete
animal diets is calculated on the basis of feed tables such as Veevoedertabel 1997, gegevens
over chemische samenstelling, verteerbaarheid en voederwaarde van voedermiddelen,
Central Veevoederbureau, Runderweg 6, 8219 pk Lelystad. ISBN 90-72839-13-7.

In a particular embodiment, the animal feed composition of the invention contains at
least one vegetable protein or protein source as defined above.

In still further particular embodiments, the animal feed composition of the invention
contains 0-80% maize; and/or 0-80% sorghum; and/or 0-70% wheat; and/or 0-70% barley;
and/or 0-30% oats; and/or 0-40% soybean meal; and/or 0-10% fish meal; and/or 0-20% whey.

Animal diets can e.g. be manufactured as mash feed (non pelleted) or pelleted feed.
Typically, the milled feed-stuffs are mixed and sufficient amounts of essential vitamins and
minerals are added according to the specifications for the species in question. Enzymes can
be added as solid or liquid enzyme formulations. For example, a solid enzyme formulation is
typically added before or during the mixing step; and a liquid enzyme preparation is typically
added after the pelleting step. The enzyme may also be incorporated in a feed additive or
premix.

The final enzyme concentration in the diet is within the range of 0.01-200 mg enzyme
protein per kg diet, for example in the range of 0.5-25 mg enzyme protein per kg animal diet.

The protease should of course be applied in an effective amount, i.e. in an amount
adequate for improving solubilisation and/or improving nutritional value of feed. It is at present
contemplated that the enzyme is administered in one or more of the following amounts
(dosage ranges): 0.01-200; 0.01-100; 0.5-100; 1-50; 5-100; 10-100; 0.05-50; or 0.10-10, all
these ranges being in mg protease enzyme protein per kg feed (ppm).

For determining mg enzyme protein per kg feed, the protease is purified from the feed
composition, and the specific activity of the purified protease is determined using a relevant
assay (see under protease activity, substrates, and assays). The protease activity of the feed
composition as such is also determined using the same assay, and on the basis of these two
determinations, the dosage in mg enzyme protein per kg feed is calculated.
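
The dosage determination described above amounts to dividing the activity found in the feed by the specific activity of the purified protease. A minimal sketch in Python follows; the function name, the activity units and the numbers are hypothetical, and both activities are assumed to be measured with the same assay.

```python
def enzyme_protein_mg_per_kg(feed_activity_units_per_kg: float,
                             specific_activity_units_per_mg: float) -> float:
    """Dosage = activity found in the feed / specific activity of the purified protease."""
    if specific_activity_units_per_mg <= 0:
        raise ValueError("specific activity must be positive")
    return feed_activity_units_per_kg / specific_activity_units_per_mg


# Hypothetical assay results, both measured with the same assay.
feed_activity = 3000.0      # activity units per kg feed
specific_activity = 200.0   # activity units per mg purified enzyme protein

dosage = enzyme_protein_mg_per_kg(feed_activity, specific_activity)
in_range = 0.01 <= dosage <= 200.0  # broadest dosage range named in the text
print(f"{dosage:.1f} mg enzyme protein per kg feed, within 0.01-200: {in_range}")
```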

The same principles apply for determining mg enzyme protein in feed additives. Of
course, if a sample is available of the protease used for preparing the feed additive or the feed,
the specific activity is determined from this sample (no need to purify the protease from the
feed composition or the additive).

The present invention is further described by the following examples which should not
be construed as limiting the scope of the invention.

Detergent Compositions

The protease of the invention may be added to and thus become a component of a
detergent composition. The detergent composition of the invention may for example be
formulated as a hand or machine laundry detergent composition including a laundry additive
composition suitable for pre-treatment of stained fabrics and a rinse added fabric softener
composition, or be formulated as a detergent composition for use in general household hard
surface cleaning operations, or be formulated for hand or machine dishwashing operations.

In a specific aspect, the invention provides a detergent additive comprising the
protease of the invention. The detergent additive as well as the detergent composition may
comprise one or more other enzymes such as another protease, such as alkaline proteases
from Bacillus, a lipase, a cutinase, an amylase, a carbohydrase, a cellulase, a pectinase, a
mannanase, an arabinase, a galactanase, a xylanase, an oxidase, e.g., a laccase, and/or a
peroxidase.

In general the properties of the chosen enzyme(s) should be compatible with the
selected detergent (i.e. pH-optimum, compatibility with other enzymatic and non-enzymatic
ingredients, etc.), and the enzyme(s) should be present in effective amounts.

Suitable lipases include those of bacterial or fungal origin. Chemically modified or
protein engineered mutants are included. Examples of useful lipases include lipases from
Humicola (synonym Thermomyces), e.g. from H. lanuginosa (T. lanuginosus) as described in
EP 258068 and EP 305216 or from H. insolens as described in WO 96/13580, a
Pseudomonas lipase, e.g. from P. alcaligenes or P. pseudoalcaligenes (EP 218272), P.
cepacia (EP 331376), P. stutzeri (GB 1,372,034), P. fluorescens, Pseudomonas sp. strain SD
705 (WO 95/06720 and WO 96/27002), P. wisconsinensis (WO 96/12012), a Bacillus lipase,
e.g. from B. subtilis (Dartois et al. (1993), Biochemica et Biophysica Acta, 1131, 253-360), B.
stearothermophilus (JP 64/744992) or B. pumilus (WO 91/16422). Other examples are lipase
variants such as those described in WO 92/05249, WO 94/01541, EP 407225, EP 260105,
WO 95/35381, WO 96/00292, WO 95/30744, WO 94/25578, WO 95/14783, WO 95/22615,
WO 97/04079 and WO 97/07202. Preferred commercially available lipase enzymes include
Lipolase™ and Lipolase Ultra™ (Novozymes A/S).

Suitable amylases (alpha- and/or beta-) include those of bacterial or fungal origin.
Chemically modified or protein engineered mutants are included. Amylases include, for
example, alpha-amylases obtained from Bacillus, e.g. a special strain of B. licheniformis,
described in more detail in GB 1,296,839. Examples of useful amylases are the variants
described in WO 94/02597, WO 94/18314, WO 96/23873, and WO 97/43424, especially the
variants with substitutions in one or more of the following positions: 15, 23, 105, 106, 124, 128,
133, 154, 156, 181, 188, 190, 197, 202, 208, 209, 243, 264, 304, 305, 391, 408, and 444.
Commercially available amylases are Duramyl™, Termamyl™, Fungamyl™ and BAN™
(Novozymes A/S), Rapidase™ and Purastar™ (from Genencor International Inc.).

Suitable cellulases include those of bacterial or fungal origin. Chemically modified or
protein engineered mutants are included. Suitable cellulases include cellulases from the
genera Bacillus, Pseudomonas, Humicola, Fusarium, Thielavia, Acremonium, e.g. the fungal
cellulases produced from Humicola insolens, Myceliophthora thermophila and Fusarium
oxysporum disclosed in US 4,435,307, US 5,648,263, US 5,691,178, US 5,776,757 and WO
89/09259. Especially suitable cellulases are the alkaline or neutral cellulases having colour
care benefits. Examples of such cellulases are cellulases described in EP 0 495 257, EP
0 531 372, WO 96/11262, WO 96/29397, WO 98/08940. Other examples are cellulase variants
such as those described in WO 94/07998, EP 0 531 315, US 5,457,046, US 5,686,593, US
5,763,254, WO 95/24471, WO 98/12307 and WO 99/01544. Commercially available cellulases
include Celluzyme™ and Carezyme™ (Novozymes A/S), Clazinase™ and Puradax HA™
(Genencor International Inc.), and KAC-500(B)™ (Kao Corporation).

Suitable peroxidases/oxidases include those of plant, bacterial or fungal origin.
Chemically modified or protein engineered mutants are included. Examples of useful
peroxidases include peroxidases from Coprinus, e.g. from C. cinereus, and variants thereof as
those described in WO 93/24618, WO 95/10602, and WO 98/15257. Commercially available
peroxidases include Guardzyme™ (Novozymes).

The detergent enzyme(s) may be included in a detergent composition by adding
separate additives containing one or more enzymes, or by adding a combined additive
comprising all of these enzymes. A detergent additive of the invention, i.e. a separate additive
or a combined additive, can be formulated e.g. as a granulate, a liquid, a slurry, etc. Preferred
detergent additive formulations are granulates, in particular non-dusting granulates, liquids, in
particular stabilized liquids, or slurries.

Non-dusting granulates may be produced, e.g., as disclosed in US 4,106,991 and
4,661,452 and may optionally be coated by methods known in the art. Examples of waxy
coating materials are poly(ethylene oxide) products (polyethyleneglycol, PEG) with mean
molar weights of 1000 to 20000; ethoxylated nonylphenols having from 16 to 50 ethylene oxide
units; ethoxylated fatty alcohols in which the alcohol contains from 12 to 20 carbon atoms and
in which there are 15 to 80 ethylene oxide units; fatty alcohols; fatty acids; and mono- and di-
and triglycerides of fatty acids. Examples of film-forming coating materials suitable for
application by fluid bed techniques are given in GB 1483591. Liquid enzyme preparations may,
for instance, be stabilised by adding a polyol such as propylene glycol, a sugar or sugar
alcohol, lactic acid or boric acid according to established methods. Protected enzymes may be
prepared according to the method disclosed in EP 238216.

The detergent composition of the invention may be in any convenient form, e.g., a bar,
a tablet, a powder, a granule, a paste or a liquid. A liquid detergent may be aqueous, typically
containing up to 70 % water and 0-30 % organic solvent, or non-aqueous.

The detergent composition comprises one or more surfactants, which may be non-ionic
including semi-polar and/or anionic and/or cationic and/or zwitterionic. The surfactants are
typically present at a level of from 0.1% to 60% by weight.

When included therein the detergent will usually contain from about 1% to about 40%
of an anionic surfactant such as linear alkylbenzenesulfonate, alpha-olefinsulfonate, alkyl
sulfate (fatty alcohol sulfate), alcohol ethoxysulfate, secondary alkanesulfonate, alpha-sulfo
fatty acid methyl ester, alkyl- or alkenylsuccinic acid or soap.

When included therein the detergent will usually contain from about 0.2% to about 40%
of a non-ionic surfactant such as alcohol ethoxylate, nonylphenol ethoxylate,
alkylpolyglycoside, alkyldimethylamineoxide, ethoxylated fatty acid monoethanolamide, fatty
acid monoethanolamide, polyhydroxy alkyl fatty acid amide, or N-acyl N-alkyl derivatives of
glucosamine ("glucamides").

The detergent may contain 0-65 % of a detergent builder or complexing agent such as
zeolite, diphosphate, triphosphate, phosphonate, carbonate, citrate, nitrilotriacetic acid,
ethylenediaminetetraacetic acid, diethylenetriaminepentaacetic acid, alkyl- or alkenylsuccinic
acid, soluble silicates or layered silicates (e.g. SKS-6 from Hoechst).

The detergent may comprise one or more polymers. Examples are
carboxymethylcellulose, poly(vinylpyrrolidone), poly(ethylene glycol), poly(vinyl alcohol),
poly(vinylpyridine-N-oxide), poly(vinylimidazole), polycarboxylates such as polyacrylates,
maleic/acrylic acid copolymers and lauryl methacrylate/acrylic acid copolymers.

The detergent may contain a bleaching system which may comprise a H2O2 source
such as perborate or percarbonate which may be combined with a peracid-forming bleach
activator such as tetraacetylethylenediamine or nonanoyloxybenzenesulfonate. Alternatively,
the bleaching system may comprise peroxyacids of e.g. the amide, imide, or sulfone type.

The enzyme(s) of the detergent composition of the invention may be stabilized using
conventional stabilizing agents, e.g., a polyol such as propylene glycol or glycerol, a sugar or
sugar alcohol, lactic acid, boric acid, or a boric acid derivative, e.g., an aromatic borate ester,
or a phenyl boronic acid derivative such as 4-formylphenyl boronic acid, and the composition
may be formulated as described in e.g. WO 92/19709 and WO 92/19708.

The detergent may also contain other conventional detergent ingredients such as e.g.
fabric conditioners including clays, foam boosters, suds suppressors, anti-corrosion agents,
soil-suspending agents, anti-soil redeposition agents, dyes, bactericides, optical brighteners,
hydrotropes, tarnish inhibitors, or perfumes.

It is at present contemplated that in the detergent compositions any enzyme, in
particular the enzyme of the invention, may be added in an amount corresponding to 0.01-100
mg of enzyme protein per liter of wash liquor, preferably 0.05-5 mg of enzyme protein per liter
of wash liquor, in particular 0.1-1 mg of enzyme protein per liter of wash liquor.

The enzyme of the invention may additionally be incorporated in the detergent
formulations disclosed in WO 97/07202.

The invention described and claimed herein is not to be limited in scope by the specific
embodiments herein disclosed, since these embodiments are intended as illustrations of
several aspects of the invention. Any equivalent embodiments are intended to be within the
scope of this invention. Indeed, various modifications of the invention in addition to those
shown and described herein will become apparent to those skilled in the art from the foregoing
description. Such modifications are also intended to fall within the scope of the appended
claims. In the case of conflict, the present disclosure including definitions will control.

Various references are cited herein, the disclosures of which are incorporated by
reference in their entireties.

EXAMPLES 
Materials and methods 
Strains:

Bacillus subtilis PL1801 (Diderichsen, B. et al. 1990. Cloning of aldB, which encodes
alpha-acetolactate decarboxylase, an exoenzyme from Bacillus brevis. J. Bacteriol.,
172, 4315-4321)
Bacillus subtilis MB1053
Bacillus subtilis PL3598-37
Bacillus subtilis MB151S
Bacillus subtilis PL2306. This strain is the B. subtilis DN1885 with disrupted apr and
npr genes (Diderichsen, B., Wedsted, U., Hedegaard, L., Jensen, B. R., Sjøholm, C.
(1990) Cloning of aldB, which encodes alpha-acetolactate decarboxylase, an
exoenzyme from Bacillus brevis. J. Bacteriol., 172, 4315-4321) which is also
disrupted in the transcriptional unit of the known Bacillus subtilis cellulase gene,
resulting in cellulase negative cells. The disruption was performed essentially as
described in (Eds. A.L. Sonenshein, J.A. Hoch and Richard Losick (1993) Bacillus
subtilis and other Gram-Positive Bacteria, American Society for Microbiology, p.618).

Procedure for isolating genomic DNA.
Harvest 1.5 ml culture and resuspend in 100 µl TEL. Leave at 37°C for 30 min.
Add 500 µl thiocyanate buffer and leave at room temperature for 10 min.
Add 250 µl NH4Ac and leave on ice for 10 min.
Add 500 µl CIA and mix.
Transfer to a microcentrifuge and spin for 10 min at full speed.
Transfer supernatant to a new Eppendorf tube and add 0.54 volume cold isopropanol. Mix
thoroughly.
Spin and wash the DNA pellet with 70 % EtOH.
Resuspend the genomic DNA in 100 µl TER.

TE:           10 mM Tris-HCl, pH 7.4
              1 mM EDTA, pH 8.0
TEL:          50 mg/ml lysozyme in TE-buffer
Thiocyanate:  5 M guanidium thiocyanate
              100 mM EDTA
              0.6 % w/v N-laurylsarcosine, sodium salt
              60 g thiocyanate, 20 ml 0.5 M EDTA, pH 8.0, 20 ml H2O
              dissolves at 65°C. Cool down to RT and add 0.6 g N-
              laurylsarcosine. Add H2O to 100 ml and filter it through a 0.2 µ
              filter.
NH4Ac:        7.5 M CH3COONH4
TER:          1 µg/ml RNase A in TE-buffer
CIA:          Chloroform/isoamyl alcohol 24:1
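
The thiocyanate buffer recipe can be checked against its stated concentrations. The sketch below, in Python, assumes the recipe reads 60 g guanidinium thiocyanate, 20 ml of 0.5 M EDTA and 0.6 g N-laurylsarcosine made up to 100 ml, and uses a molar mass of 118.16 g/mol for guanidinium thiocyanate. The function names are hypothetical.

```python
GUANIDINIUM_THIOCYANATE_MW = 118.16  # g/mol


def molarity(mass_g: float, molar_mass_g_per_mol: float, final_volume_ml: float) -> float:
    """Molar concentration of a solute made up to a final volume."""
    return mass_g / molar_mass_g_per_mol / (final_volume_ml / 1000.0)


def percent_w_v(mass_g: float, final_volume_ml: float) -> float:
    """Percent weight per volume: grams of solute per 100 ml of solution."""
    return mass_g / final_volume_ml * 100.0


gtc_molar = molarity(60.0, GUANIDINIUM_THIOCYANATE_MW, 100.0)
# 20 ml of 0.5 M EDTA diluted to 100 ml, expressed in mM
edta_mm = 0.5 * 20.0 / 100.0 * 1000.0
sarcosine_pct = percent_w_v(0.6, 100.0)
print(f"{gtc_molar:.2f} M guanidinium thiocyanate, {edta_mm:.0f} mM EDTA, "
      f"{sarcosine_pct:.1f} % w/v N-laurylsarcosine")
```

Under these assumptions the recipe gives roughly 5 M guanidinium thiocyanate, 100 mM EDTA and 0.6 % w/v N-laurylsarcosine, in line with the nominal buffer composition.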



Purification of PCR bands and DNA sequencing

PCR fragments can be purified using the GFX™ PCR DNA and Gel Band Purification
Kit (Pharmacia Biotech) according to the manufacturer's instructions. The nucleotide
sequences of the amplified PCR fragments are determined on an ABI PRISM™ 3700 DNA
Analyzer (Perkin Elmer, USA) using 50-100 ng as template, the Taq deoxy-terminal cycle
sequencing kit (Perkin Elmer, USA), fluorescent labeled terminators and 5 pmol of the
sequencing primer of choice.

Media

TY: (As described in Ausubel, F. M. et al. (eds.) "Current protocols in Molecular Biology". John
Wiley and Sons, 1995).

LB agar: (As described in Ausubel, F. M. et al. (eds.) "Current protocols in Molecular Biology".
John Wiley and Sons, 1995).

LB-PG agar: LB agar supplemented with 0.5% glucose and 0.05 M potassium phosphate,
pH 7.0.

Protease activity

S2A protease activity is measured using the PNA assay with succinyl-alanine-
alanine-proline-phenylalanine-paranitroanilide as a substrate unless otherwise mentioned. The
principle of the PNA assay is described in Rothgeb, T.M., Goodlander, B.D., Garrison, P.H.,
and Smith, L.A., Journal of the American Oil Chemists' Society, Vol. 65 (5) pp. 806-810 (1988).

Gene expression in Bacillus subtilis host

All the expressed genes in the following examples are integrated by homologous
recombination on the Bacillus subtilis host cell genome. The genes are expressed under the
control of a triple promoter system (as described in WO 99/43835), consisting of the promoters
from the Bacillus licheniformis alpha-amylase gene (amyL), the Bacillus amyloliquefaciens
alpha-amylase gene (amyQ), and the Bacillus thuringiensis cryIIIA promoter including
stabilizing sequence. The gene coding for chloramphenicol acetyl-transferase was used as
marker (described in e.g. Diderichsen, B.; Poulsen, G.B.; Joergensen, S.T.; A useful cloning
vector for Bacillus subtilis. Plasmid 30:312 (1993)).

Example 1. Construction of synthetic 10R tail-variant genes with Savinase signal

A synthetic 10R gene (10RS) encoding a S2A protease denoted 10R from
Nocardiopsis sp. NRRL 18262 having the amino acid sequence shown in SEQ ID NO: 43 (WO
01/58276) was constructed, which has the nucleotide sequence shown in SEQ ID NO: 1. This
synthetic gene was fused by PCR in frame to the DNA coding for the signal peptide from
SAVINASE™ (Novozymes) resulting in the coding sequence Sav-10RS which is shown in
SEQ ID NO: 2. Several tail-variants of this construct were made. Compared to the Sav-10RS
protease encoded by SEQ ID NO: 2, the tail variant construct Sav-10RS HV0 was constructed
to have 8 amino acids extra in the C-terminus: QSHVQSAP (SEQ ID NO: 3) which were
encoded by the following DNA sequence extension inserted in front of the TAA stop codon of
SEQ ID NO: 2:

(SEQ ID NO: 4): caatcgcatgttcaatccgctcca

Tail variant Sav-10RS HV1 was constructed to have 4 amino acids extra in the C-
terminus: QSAP (SEQ ID NO: 5), with the following DNA sequence extension inserted in front
of the TAA stop codon:

(SEQ ID NO: 6): caatcggctcct

Tail variant Sav-10RS HV3 was constructed to have 2 amino acids extra in the C-
terminus: QP (SEQ ID NO: 7) with the following DNA sequence extension inserted in front of
the TAA stop codon:

(SEQ ID NO: 8): caacca

Tail variant Sav-10RS HV2 was constructed to have one amino acid extra in the C-
terminus: P (SEQ ID NO: 9) with the following DNA sequence extension inserted in front of the
TAA stop codon:

(SEQ ID NO: 10): cca
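
The correspondence between the DNA extensions and the amino acid tails can be checked by translating each extension codon by codon. The following Python sketch assumes the extension sequences read caatcgcatgttcaatccgctcca (HV0), caatcggctcct (HV1), caacca (HV3) and cca (HV2); the codon table is deliberately limited to the standard-code codons occurring in these sequences, and the names are hypothetical.

```python
# Minimal codon table covering only the codons used in the four extensions.
CODONS = {
    "caa": "Q", "tcg": "S", "tcc": "S", "cat": "H",
    "gtt": "V", "gct": "A", "cca": "P", "cct": "P",
}


def translate(dna: str) -> str:
    """Translate a lowercase DNA string, codon by codon, into one-letter amino acids."""
    if len(dna) % 3 != 0:
        raise ValueError("length must be a multiple of three")
    return "".join(CODONS[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Extension sequences as assumed above (read from the text, not independently verified).
extensions = {
    "HV0": "caatcgcatgttcaatccgctcca",
    "HV1": "caatcggctcct",
    "HV3": "caacca",
    "HV2": "cca",
}

tails = {name: translate(seq) for name, seq in extensions.items()}
print(tails)
```

Under these assumptions the translations agree with the stated tails QSHVQSAP, QSAP, QP and P.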

The 10RS gene and the four tail-variant encoding genes were integrated by
homologous recombination into the Bacillus subtilis MB1053 host cell genome.
Chloramphenicol resistant transformants were checked for protease activity on 1% skim milk
LB-PG agar plates (supplemented with 6 µg/ml chloramphenicol). Some protease positive
colonies were further analyzed by DNA sequencing of the insert to ensure the correct gene
DNA sequence, and five strains, each comprising one of the above constructs, were selected
and denoted, respectively: B. subtilis Sav-10RS, B. subtilis Sav-10RS HV0, B. subtilis Sav-10RS
HV1, B. subtilis Sav-10RS HV2 and B. subtilis Sav-10RS HV3.

Example 2. Fermentation yields of 10R tail-variants with Savinase signal

Fermentations for the production of the tail-variant enzymes of the invention were
performed on a rotary shaking table in 500 ml baffled Erlenmeyer flasks each containing 100
ml TY supplemented with 6 mg/l chloramphenicol.

Six Erlenmeyer flasks for each of the five B. subtilis strains from Example 1 were
fermented in parallel. Two of the six Erlenmeyer flasks were incubated at 37°C (250 rpm), two
at 30°C (250 rpm), and the last two at 26°C (250 rpm). A sample was taken from each shake
flask at day 1, 2 and 3 and analysed for proteolytic activity. The results are shown in tables 1-
3. As can be seen from tables 1-3, the effect of the tails is a surprisingly high improvement
on the expression level of the protease, as measured by activity in the culture broth. The effect
is most pronounced at 26°C and 30°C, but is also evident at 37°C as an effect observed
especially at the early stage of the fermentation.



Table 1: Relative proteolytic activities at 37°C.

                 | Day 1 | Day 2 | Day 3 |
Sav-10RS         | 1.0   | 1.0   | 1.0   |
Sav-10RS HV0     |       | 0.7   | 0.8   |
Sav-10RS HV1     |       | 1.3   | 1.2   |
Sav-10RS HV2     |       | 0.8   | 0.4   |
Sav-10RS HV3     | 5.3   | 1.4   | 1.7   |


Table 2: Relative proteolytic activities at 30°C.

                 | Day 1 | Day 2            | Day 3 |
Sav-10RS         | 1.0   | 1.0              |       |
Sav-10RS HV0     | 1.7   | [illegible: 9 0] | 2.9   |
Sav-10RS HV1     | 4.8   | 3.1              | 4.9   |
Sav-10RS HV2     |       | 1.3              |       |
Sav-10RS HV3     | 4.8   | 3.0              | 4.4   |


Table 3: Relative proteolytic activities at 26°C.

                 | Day 1 | Day 2 | Day 3            |
Sav-10RS         | 1.0   | 1.0   | 1.0              |
Sav-10RS HV0     | 1.8   |       | 3.1              |
Sav-10RS HV1     | 2.6   | 3.6   | 4.3              |
Sav-10RS HV2     | 1.8   | 2.6   | [illegible: ISO] |
Sav-10RS HV3     |       | 3.5   | 4.6              |
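
The relative activities in tables 1-3 are expressed against the Sav-10RS strain, which is set to 1.0. A minimal Python sketch of such a normalisation follows, assuming each strain's activity is divided by the Sav-10RS activity measured on the same day and at the same temperature. The raw readings and the function name are hypothetical; the underlying raw data are not given in the text.

```python
def relative_activities(raw: dict[str, list[float]], reference: str) -> dict[str, list[float]]:
    """Divide each strain's activity by the reference strain's activity on the same day."""
    ref = raw[reference]
    return {
        strain: [round(value / ref_value, 1) for value, ref_value in zip(values, ref)]
        for strain, values in raw.items()
    }


# Hypothetical raw activity readings (arbitrary units) for day 1, 2 and 3.
raw_readings = {
    "Sav-10RS": [10.0, 20.0, 25.0],
    "Sav-10RS HV1": [26.0, 72.0, 107.5],
}

relative = relative_activities(raw_readings, reference="Sav-10RS")
print(relative)
```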



Example 3. Chromosomal integration of tail-variant genes

The following construct was used for the chromosomal integration of the tail-variant
encoding genes. The coding sequence of the well-known subtilisin BPN' protease was
operationally linked to a triple promoter, a marker gene was fused to this (a spectinomycin
resistance gene surrounded by resolvase res-sites), and pectate lyase encoding genes from
Bacillus subtilis were fused to the construct as flanking segments comprising the 5'
polynucleotide region upstream [yfmD-yfmC-yfmB-yfmA-Pel-start], and the 3' polynucleotide
region downstream [Pel-end-yflS-citS(start)] of the tail-variant encoding polynucleotide,
respectively. The integrational cassette was made by the joining of several different PCR
fragments. After the final PCR reaction the PCR product was used for transformation of
naturally competent B. subtilis cells. One clone denoted PL3598-37 was selected and
confirmed by sequencing to contain the correct construct.



The PL3598-37 clone thus contains the following:
1. The flanking regions 100% homologous to regions of the B. subtilis genome (appearing as
the upstream fragment yfmD-yfmC-yfmB-yfmA-Pel-start and the downstream fragment
Pel-end-yflS-citS(start)).
2. The spectinomycin resistance gene flanked by resolvase sites (res).
3. The triple promoter region plus cryIIIA mRNA stabilising leader sequence.
4. The BPN' open reading frame.

A PCR fragment comprising the integrational cassette for a BPN' library was
constructed, thus operably linking a triple promoter (as described in WO 99/43836;
Novozymes) to a BPN' expression cassette from a Bacillus strain. The triple promoter is a
fusion of an optimised Bacillus amyL-derived promoter (as shown in WO 93/10249;
Novozymes) with two promoters scBAN and cryIIIA, where the first is a consensus version of
the Bacillus amyloliquefaciens amylase BAN promoter, and the latter includes a mRNA-
stabilising sequence (as described in WO 99/43835; Novozymes). Suitable primers can be
derived from the publicly available sequences (Vasantha, N. et al. Genes for alkaline protease
and neutral protease from Bacillus amyloliquefaciens contain a large open reading frame
between the regions coding for signal sequence and mature protein. J. Bacteriol. 159:811
(1984); EMBL accession No. K02496). A KpnI and a SalI restriction site were introduced to
flank the PCR fragment at each end, using the primers:

#252639 (SEQ ID NO: 11): catgtgcatgtgggfaccgcaacgttcgcagatgotgctgaagag 
#251992 (SEQ ID NO: 12): catgtgcatgtogtcgaocgatlatggagcggattgaacafgcg

The KpnI and SalI restriction sites in the PCR fragment were subsequently used to
clone the fragment into a KpnI-SalI digested Pel-Spec PCR fragment. The Pel-Spec
fragment comprises a spectinomycin resistance gene inserted in the middle of the B. subtilis
pectate lyase gene plus approx. 2,3 kb of upstream genomic DNA and approx. 1,7 kb
downstream genomic DNA. The Pel-Spec fragment was produced by PCR amplification of
genomic DNA from the B. subtilis strain MB1053, using the primers:

#179541 (SEQ ID NO: 13): gcgtEgagacgcgoggccgogagcgccgttlggctgaafgatac
#179542 (SEQ ID NO: 14): icgttgagscagctogagcagggaaaaatggaaccgctftttc

Construction of MB1053
The MB1053 B. subtilis strain was constructed by deletion of the pectate lyase (Pel)
gene through integration of a PCR product into a wild-type B. subtilis type strain genome. This
was achieved by PCR amplification of genomic DNA directly downstream and upstream of
the pectate lyase gene of B. subtilis.
The ends of the genomic DNA directly preceding and following the Pel gene were
elongated through primer insertion of sequences being 100% homologous to DNA sequences
defined by the ends of a third PCR fragment encoding a marker gene surrounded by
resolvase (Res) sites. In this particular case the marker gene (Spec) conferred resistance to
spectinomycin, and it was situated between two Res sites, altogether present on the plasmid
pSJ3358 (described in US patent No. 5,882,888). Three different PCR fragments were initially
produced.

Fragment 1: this fragment covers from the yfmD gene to the middle of the Pel gene and
introduces an overhang to the Res-Spec-Res cassette at the Pel gene. The size of fragment 1
is 2.8 kb. The fragment was produced by PCR amplification of chromosomal DNA from the
B. subtilis strain PL2306, using the primers:

#179541 (SEQ ID NO: 13), and

#179539 with overlap to #179154 Spec primer (SEQ ID NO: 15):
scafttgatcagaaftcactggcxgtcgMacaaccaftgcggaaaatagtcatsggcatcC

Fragment 2: this fragment covers from the middle of the Pel gene to after the end of the CitS
gene and introduces an overhang to the Res-Spec-Res cassette at the middle of the Pel
gene. The size of fragment 2 is 2.3 kb. The fragment was produced by PCR amplification of
chromosomal DNA from the B. subtilis strain PL2306, using the primers:

#179542 (SEQ ID NO: 14), and

#179540 with overlap to #179153 Spec primer (SEQ ID NO: 16):

ggatccagatctggtacccgggtctagagfcgacgcggcggttogcgtooggacagsaca

Fragment 3: this fragment contains the spectinomycin gene surrounded by Res sites and DNA
sequences in the ends overlapping with PCR fragments 1 and 2. The size of fragment 3 is 1,6
kb. Fragment 3 was produced by PCR amplification of plasmid pSJ3358, using the primers:

#179154 (SEQ ID NO: 17): g^agra^a^g^g|g|!at^.al^ai^
#179153 (SEQ ID NO: 18): 'ca^f<^^ac^^aca^ggta<x^at<^agatc
PCR reaction

For the PCR amplifications of fragments 1-3 the HiFi Expand™ PCR system (Roche)
was used together with the following reaction mix and cycling scheme:
5 µl Buffer 2
14 µl dNTPs (1.25 mM each)
2.5 µl 20 µM primer 1
2.5 µl 20 µM primer 2
x µl water
To this mix 3 µl of DNA (approx. 100 ng) and 0.75 µl Enzyme mix (use hot start) is added.
Total volume is 50 µl.
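
The water volume x follows from the stated 50 µl total. The short Python sketch below works through that arithmetic; it assumes the component volumes as they read in the listing above, and the variable names are illustrative only.

```python
# Component volumes (µl) of the 50 µl amplification reaction, as read from the listing.
components_ul = {
    "Buffer 2": 5.0,
    "dNTPs (1.25 mM each)": 14.0,
    "primer 1 (20 µM stock)": 2.5,
    "primer 2 (20 µM stock)": 2.5,
    "template DNA": 3.0,
    "Enzyme mix": 0.75,
}
TOTAL_UL = 50.0

# Water makes up whatever volume is left.
water_ul = TOTAL_UL - sum(components_ul.values())

# Final concentration of each primer after dilution into the full reaction volume.
primer_final_um = 20.0 * 2.5 / TOTAL_UL

print(f"water: {water_ul:.2f} µl; each primer: {primer_final_um:.1f} µM")
```

Under these readings x comes out at 22.25 µl, and each primer ends up at 1 µM in the reaction.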
The cycling profile is:

1 cycle of 120 sec at 94°C
Break.
10 cycles of 15 sec at 94°C, 60 sec at 60°C, 240 sec at 72°C
[SO] cycles of 15 sec at 94°C, 60 sec at 60°C, (180 sec at 72°C, add 20 sec per cycle)
1 cycle of 600 sec at 68°C.

The three PCR fragments were made and joined in later JOINING-PCR reactions. The first
PCR fragments were single sharp bands and no gel purification was necessary. Only
Qiagen™ PCR purification was performed prior to the following JOINING-PCR.
JOINING of fragment 1 + 3 (same procedure for fragment 2 + 3):
5 µl Buffer 2
8 µl dNTPs (1.25 mM each)
5.0 µl Fragment 3
5.0 µl Fragment 1
9,25 µl water

1 cycle of 120 sec at 94°C.
Break. Add Enzyme
10 cycles of 15 sec at 94°C, 60 sec at 60°C, 240 sec at 72°C.
Break. Add Primers
15 cycles of 15 sec at 94°C, 60 sec at 60°C, (180 sec at 72°C, add 20 sec per cycle)
1 cycle of 600 sec at 68°C.

After the first cycle at 94°C for 120 sec there is a break, where 0.75 µl Enzyme mix is added.
Total volume is now 45.0 µl.

After the initial 10 cycles, there is another break in the cycling. For fragment 1+3, 2,5 µl
(20 µM #179541) and 2.5 µl (20 µM #179153) are added, and for fragment 2+3, 2,5 µl (20 µM
#179542) and 2,5 µl (20 µM #179154) are added, and the cycling is continued for 15 cycles
more.

The PCR products were then gel purified. The size of fragment 1+3 should be 3.4 kb
and the size of fragment 2+3 should be 3,4 kb. These two fragments were joined in a last PCR
reaction (Expand™ long system, Roche):

5 µl Buffer 1
14 µl dNTPs (1,25 mM each)
6.0 µl Fragment 1+3
5.0 µl Fragment 2+3
17.75 µl water

After the first cycle at 94°C for 120 sec there is a break, where 0,75 µl Enzyme mix is added.
Total volume is now 45.0 µl.

After the initial 10 cycles, there is another break in the cycling and 2,5 µl (20 µM #179541) and
2,5 µl (20 µM #179542) are added and the cycling is continued for 15 cycles more.

1 cycle of 120 sec at 94°C.
Break. Add Enzyme
10 cycles of 15 sec at 94°C, 60 sec at 60°C, 240 sec at 68°C.
Break. Add Primers
15 cycles of 15 sec at 94°C, 60 sec at 60°C, 180 sec at 68°C, add 20 sec per cycle
1 cycle of 600 sec at 68°C.
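
To get a feel for how long such a program holds the block at temperature, the profile can be written down as data and summed. The Python sketch below is illustrative only: it assumes the step durations as they read above (the source digits are partly uncertain), assumes the 20 sec increment starts with the second cycle of its block, and ignores ramp times and the manual breaks. The names `program` and `hold_time_s` are hypothetical.

```python
# Each entry: (number of cycles, segment durations in seconds, extension increment per cycle).
program = [
    (1, [120], 0),            # initial denaturation
    (10, [15, 60, 240], 0),   # first block, before the primers are added
    (15, [15, 60, 180], 20),  # second block, extension grows by 20 s each cycle
    (1, [600], 0),            # final extension
]

def hold_time_s(program):
    """Sum the programmed hold times, excluding ramping and manual breaks."""
    total = 0
    for cycles, segments, increment in program:
        for k in range(cycles):
            # k = 0 for the first cycle of a block, so the increment starts at cycle 2.
            total += sum(segments) + increment * k
    return total

total_s = hold_time_s(program)
print(f"{total_s} s, about {total_s / 60:.0f} min")
```

The incremental extension adds 2100 sec on its own under these assumptions, which is why the second block dominates the run time.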

The size of the joined PCR fragment is 6,8 kb. This PCR fragment was purified using
a Qiagen™ PCR purification kit, and 5 µl of the 50 µl eluted DNA was used to transform a
standard B. subtilis strain. After transformation, cells were spread onto LBPG-120 µg/ml of
spectinomycin. Next day more than 1000 colonies were seen. 8 of these were checked using
the PCR primers from the last JOINING PCR amplification, yielding a PCR fragment of 6.8 kb rather than
the 5.2 kb expected if deletion had not occurred. Furthermore, the pectate lyase activity of the
clones was checked with the Mancini immunoassay, which showed no reactivity towards the
pectate lyase activity. This, taken together with the Spec resistance, tells us that deletion had
occurred. One such clone was selected and denoted MB1053.
Insertion of BPN' res-spec-res in MB1053

The ligation mix of the digested PCR amplified triple promoter BPN' expression
cassette and the KpnI-SalI digested Pel-Spec PCR fragment was used as template in a PCR
amplification using the PCR primers #179541 and #179542. This resulted in a PCR fragment
of approx. 9 kb, which was used to transform B. subtilis PL1801 (Diderichsen, B. et al. 1990.
Cloning of aldB, which encodes alpha-acetolactate decarboxylase, an exoenzyme from
Bacillus brevis. J. Bacteriol., 172, 4315-4321) competent cells. The transformed cells were
plated on LB-120 µg/ml spectinomycin agar plates with skim milk. Spectinomycin resistant
colonies with large skim milk clearing zones were restreaked on spectinomycin agar plates
and analysed for the integration of the PCR fragment with PCR using the primers #179541
(SEQ ID NO: 13) and #179542 (SEQ ID NO: 14).

Appearance of a 8 kb fragment indicates that the PCR fragment has been integrated
into the host cell genome. Several of these clones were sequenced to confirm integration of
the expression cassette; one such clone was selected and denoted PL3598-37.

Example 4. Construction of plasmid-borne chromosomal integrational cassette

An E. coli plasmid-borne integrational cassette for a library may be constructed in
vivo. An integration cassette to be used according to the method of the invention may be
present on an E. coli plasmid (which is capable only of replication in E. coli, not in B. subtilis), the
plasmid comprising:

i) The DNA sequence encoding the Pre-Pro-domains of the subtilisin protease
commonly known as Savinase, preceded by and operably linked to

ii) a DNA sequence comprising a mRNA stabilising segment derived in this particular
case from the CryIIIA gene;

iii) a marker gene (a chloramphenicol resistance gene), and

iv) genomic DNA from Bacillus subtilis as 5' and 3' flanking segments: the
homologous 5' polynucleotide region upstream of the polynucleotide [yfmD-yfmC-yfmB-yfmA-
Pel-start], and the 3' polynucleotide region downstream of the polynucleotide [Pel-end-yflS-
citS(start)], respectively.

The cassette was made by several cloning steps involving digestion of pUC19
plasmid and PCR fragments at appropriate restriction endonuclease sites, and cloning of several
different PCR fragments into the generally used plasmid pUC19. After each ligation of a PCR fragment
into a plasmid, the ligation mixture was transformed into electrocompetent DH5alpha E. coli
cells that were prepared for and transformed by electroporation using a Gene Pulser™
electroporator from BIO-RAD as described by the supplier. One final plasmid construct was
confirmed by sequencing to contain the correct construct as outlined above, and it was
denoted pMB1508.

The pMB1508 plasmid thus contains the following:
i) The CryIIIA mRNA stabilising leader sequence including a ribosome binding
sequence (RBS), operationally linked to

ii) DNA encoding the Pre-Pro-domains of the subtilisin commonly known as
Savinase, including KpnI and NotI sites for cloning;

iii) The chloramphenicol resistance operon;
iv) The 3' downstream flanking region [Pel-end-yflS-citS(start)] which is 99-100%
homologous to the region of B. subtilis.

The four elements listed were cloned in the pUC19 vector (isolated from E. coli ATCC
37254; Vieira J, Messing J. The pUC plasmids, an M13mp7-derived system for insertion
mutagenesis and sequencing with synthetic universal primers. Gene 19: 259-268, 1982) in the
EcoRI and SalI sites to give pMB1508. In order for the resulting plasmid to integrate efficiently
at a specified site in the B. subtilis genome, a new strain was established. The new strain is a
derivative of Bacillus subtilis 168, BGSC accession number 1A1 168 trpC2. The strain was
made competent and transformed as described above. Using elements from the PL3598-37
clone described above, the new integration strain denoted MB1510 was established and
characterised to contain the following elements from PL3598-37:
i) The triple promoter and the mRNA stabilising element,

ii) Flanking segments comprising the following homologous polynucleotide region
[yfmD-yfmC-yfmB-yfmA-Pel-start] upstream of the triple promoter, and the polynucleotide
region [Pel-end-yflS-citS(start)] downstream of the mRNA stabilising element.

Thus, when using MB1510 competent cells, it is possible for pMB1508 (or
derivatives thereof) to integrate directly into the genome of MB1510 where the two flanking
regions in fusion with the triple promoter and mRNA stabilising element are located, resulting in
a construction where the incoming PrePro encoding DNA of pMB1508 has been
integrated in the correct reading frame with the triple promoter, the mRNA stabilising element
and the RBS. This results in high expression of the integrated gene from the promoter
elements already present on the genome of MB1510.

Transformation efficiency was established for the B. subtilis strain MB1510
transformed with E. coli prepared plasmid pMB1508. For further testing of the potential of using
this approach, the Savinase encoding gene of Bacillus clausii was PCR amplified using the two
PCR primers:

Primer #31? (SEQ ID NO: 19): tigcgoaateggaecatgggg

Primer #139 NotI (SEQ ID NO: 20): cst^g^g^gcegc^te^Ggftgccgcttctgcg

The resulting ~0.8 kb Savinase fragment and the pMB1508 plasmid are
digested with KpnI and NotI, and the resulting fragments are then purified by agarose gel
electrophoresis. The two fragments are ligated, and the ligation mixture is used to transform
competent E. coli cells which are then plated on LB-agar plates or placed in liquid media for
growth overnight at 37°C; both types of media contain 50-100 µg/ml of ampicillin. After
incubation, a plasmid prep is made of the liquid culture. The purified plasmid is used for
transformation of competent cells of MB1510 (using 100-10.000 ng of plasmid per
transformation). The transformed cells are plated onto TY medium with 2% skim milk and 6
µg/ml of chloramphenicol for selection. After overnight incubation at 37°C clearing zones
appear around those colonies wherein the integration cassette is integrated properly into the
cells, indicating high Savinase expression.

This approach can also be used to make highly diverse libraries of any gene of
interest expressible in B. subtilis, where rather than a gene encoding one enzyme any
expressible polynucleotide is inserted into the plasmid pMB1508 and integrated into the
MB1510 strain for subsequent screening.




The plasmid pMB1508 has the following components, indicated by basepair positions:

BP 5186-395: pUC19 sequence from E. coli clone ATCC 37254, Vieira J, Messing J.
The pUC plasmids, an M13mp7-derived system for insertion mutagenesis and sequencing with
synthetic universal primers. Gene 19: 259-268, 1982.

BP 396-1021: EcoR I cloning site (BP 396-401) and the CryIIIA mRNA stabilising
element. (Described in WO 9634983-A1)

BP 1022-1412: Encodes the Pre-Pro sequence of Savinase and the NotI cloning
site. (Pre-Pro part described in e.g. WO 9623073-A1; the NotI site and the spacing between the
Pre-Pro and NotI were introduced by the PCR primer.)

BP 1413-2512: The Bgl II cloning site (BP 1413-1418) and the chloramphenicol
acetyl-transferase operon of pDN1050 (described in e.g. Diderichsen, B.; Poulsen, G.B.;
Joergensen, S.T.; A useful cloning vector for Bacillus subtilis. Plasmid 30:312 (1993)).
BP 2513-5185: The polynucleotide region [Pel-end-yflS-citS(start)] downstream of
the pelB locus of the B. subtilis genome (as it appears from the publication and corresponding
database of: F. Kunst, N. Ogasawara, I. Moszer, <146 other authors>, H. Yoshikawa, A.
Danchin. "The complete genome sequence of the Gram-positive bacterium Bacillus subtilis"
Nature (1997) 390:249-256).

MB1510 has the following specific features in and around the pelB locus:
i) The triple promoter and the mRNA stabilising element including a RBS (ribosome binding
sequence).

ii) Flanking segments comprising the following homologous polynucleotide region [yfmD-yfmC-
yfmB-yfmA-Pel-start] upstream of the triple promoter, and the polynucleotide region [Pel-end-
yflS-citS(start)] downstream of the mRNA stabilising sequence.

Sequence of MB1510 genomic integration region (SEQ ID NO: 22):

BP 1-2873: corresponds to the sequence of the Bacillus subtilis genome yfmD-yfmC-yfmB-
yfmA-Pel-start (as it appears from the publication and corresponding database of: F. Kunst et
al. "The complete genome sequence of the Gram-positive bacterium Bacillus subtilis"
Nature (1997) 390:249-256).

BP 3102-4082: The triple promoter and CryIIIA mRNA stabilising element plus RBS.
(Described above in the PL3598-37 construct).

BP 4083-5718: The polynucleotide region [Pel-end-yflS-citS(start)], end of and
downstream of the pelB locus of the B. subtilis genome (as it appears from the publication and
corresponding database of: F. Kunst, N. Ogasawara, I. Moszer, <146 other authors>, H.
Yoshikawa, A. Danchin. "The complete genome sequence of the Gram-positive bacterium
Bacillus subtilis" Nature (1997) 390:249-256).

Example 5. Construction of a 2 amino-acid tail-variant library

This example shows the construction of a tail-variant library. In this library two amino
acids were introduced at the C-terminal of the 10R protein. Such a tail-library may be made
with the method described above using the following PCR primers in a PCR reaction using
genomic DNA from B. subtilis 10RS as template:

1605 (SEQ ID NO: 23): gacggccaqtgaattogataaaagtgc
1606 (SEQ ID NO: 24): csagaMctatnkthktgtacggagtotaactocccaagag
Wherein N = A, C, G or T; and K = T or G.
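
The degenerate positions limit which residues the library can place in the tail. The Python sketch below enumerates them. It rests on two assumptions that are not stated explicitly in the text: that the degenerate unit in the reverse primer 1606 reads "tnk" (with N = A, C, G or T and K = T or G), and that, because 1606 is the reverse primer, the codon on the coding strand is the reverse complement of that unit. The function names are illustrative.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Only the ambiguity codes needed for this primer.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "ACGT", "K": "GT"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]

def encoded_residues(primer_unit):
    """Residues encoded on the coding strand by a degenerate unit of a reverse primer."""
    residues = set()
    for bases in product(*(IUPAC[b] for b in primer_unit.upper())):
        codon = reverse_complement("".join(bases))
        residues.add(CODON_TABLE[codon])
    return residues

tail_residues = encoded_residues("TNK")
print(sorted(tail_residues))
```

Under these assumptions each tail position can carry I, K, L, P, Q, R or T, which agrees with the possibilities listed for the two positions in Table 5, and no stop codon is encoded within the tail itself.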

The resulting PCR product was digested with EcoR I and Bgl II and ligated into EcoR
I and Bgl II digested pMB1508. Hereafter, following the principle described above,
chloramphenicol resistant Bacillus subtilis transformants were picked by a robotic
colony picker from a bioassay plate and transferred into a 384 well microtiter plate (MTP)
containing 0.05 x TY supplemented with 6 mg/l chloramphenicol (50 µl/well). The MTPs were
incubated at 26°C for 72h. After incubation each well was analyzed for proteolytic activity.

The thirty Bacillus subtilis transformants with highest proteolytic activity were
selected for determination of the two tail amino acids in each transformant by DNA
sequencing; the sequencing results are summarised in table 4 and table 5.



AA Tail 


No. of transformants


TL 




TT 


4 


QL 


3 






LP 

_ 


3 




2 


10 


a 


OP 


2 




2 


LT 


1 


TQ 


1 


" If 


1 










Total 


30 



Table 4: column one shows the amino acid sequence of the tail, and column two shows the
number of Bacillus subtilis transformants sequenced with that particular AA tail sequence.
Possibilities position 1    Result         Possibilities position 2    Result
K                           [illegible]    K                           0
R                           0              R                           0
T                           14             T                           [illegible]
I                           3              I                           4
Q                           5              Q                           5
P                           3              P                           [illegible]
L                           4              L                           7
Total                       30             Total                       30

Table 5: The table shows the amino acids which could be introduced by the primer used for the
library construct and the actual findings by DNA sequencing of the thirty colonies isolated from
screening.

Example 6. Construction of Bacillus subtilis strains L2, L2 HV0, and L2 HV1

A Bacillus subtilis strain was made analogously with the construction of the Bacillus
subtilis strain 10RS, with the DNA coding for the pro-form of the S2A protease from
Nocardiopsis dassonvillei subsp. dassonvillei DSM 43235, denoted L2, fused by PCR in frame
to the DNA coding for the signal peptide from SAVINASE™ (a well-known commercial
protease derived from Bacillus clausii, available from Novozymes, Denmark); the resulting
strain was denoted Bacillus subtilis Sav-L2.

The DNA sequence including the coding region for the pro-mature S2A protease
from Nocardiopsis dassonvillei subsp. dassonvillei DSM 43235, as amplified with primers 1423
and 1475, is shown in SEQ ID NO: 25. The corresponding encoded pro-form amino acid
sequence for the L2 protease is shown in SEQ ID NO: 28.

1423 (SEQ ID NO: 26): gcttttagttoafcgatcgcatcggctgotccggcccccgtcccccag
1475 (SEQ ID NO: 27): ggagogga%aacatgcgattaggtccggatcctgacaccccag

Two tail-variants of this construct were also made. Tail variant Sav-L2 HV0 was
constructed to have 8 amino acids extra in the C-terminus: QSHVQSAP (SEQ ID NO: 3), by
using the DNA sequence extension inserted in front of the TAA stop codon which is shown in
SEQ ID NO: 4. Tail variant Sav-L2 HV1 was constructed to have 4 amino acids extra in the C-
terminus: QSAP (SEQ ID NO: 5), by using the DNA sequence extension inserted in front of the
TAA stop codon which is shown in SEQ ID NO: 6. Both tail variants had the SAVINASE™
signal-peptide encoding sequence fused in frame with the pro-mature encoding sequence, just
like in Sav-L2.

The Sav-L2 gene and the two tail-variants Sav-L2 HV0 and Sav-L2 HV1 were
integrated by homologous recombination on the Bacillus subtilis MB1053 host cell genome as
outlined above. Chloramphenicol resistant transformants were checked for protease activity on
1% skim milk LB-PG agar plates (supplemented with 6 µg/ml chloramphenicol). Some
protease positive colonies were further analyzed by DNA sequencing of the insert to confirm
the correct DNA sequence, and one strain for each construct was selected and denoted
B. subtilis Sav-L2, B. subtilis Sav-L2 HV0, and B. subtilis Sav-L2 HV1, respectively.

Example 7. Fermentation yields of the Bacillus strains of example 6

The three B. subtilis strains of example 6 were fermented on a rotary shaking table
in 500 ml baffled Erlenmeyer flasks containing 100 ml TY supplemented with 6 mg/l
chloramphenicol. Six Erlenmeyer flasks for each of the three B. subtilis strains were fermented
in parallel. Two of the six Erlenmeyer flasks were incubated at 37°C (250 rpm), two at 30°C
(250 rpm), and the last two at 26°C (250 rpm). A sample was taken from each shake flask at
day 1, 2 and 3 and analysed for proteolytic activity. The results are shown in tables 6-8. As
can be seen from tables 6-8, the tails also increase the expression level of the
Sav-L2 protease from Nocardiopsis dassonvillei subsp. dassonvillei DSM 43235 when
expressed in B. subtilis. An increase of up to 40% is observed in this experiment, but overall
improvement is observed for both tail-variants at all three temperatures tested.



Table 6. Relative proteolytic activities at 37°C.

              Day 1    Day 2    Day 3
Sav-L2        1,0      1,0      1,0
Sav-L2 HV1    1,4      1,3      1,2
Sav-L2 HV0    1,3      1,1      1,4

Table 7. Relative proteolytic activities at 30°C.

              Day 1    Day 2    Day 3
Sav-L2        1,0      1,0      [illegible]
Sav-L2 HV1    1,0      1,2      1,4
Sav-L2 HV0    1,1      1,3      1,3


Table 8. Relative proteolytic activities at 26°C.

Day 3

Sav-L2        1,0
Sav-L2 HV1
Example 8. 10R tail-variants with heterologous pro-regions in Bacillus

The DNA sequence coding for the pro-region from the L2 protease from
Nocardiopsis dassonvillei subsp. dassonvillei DSM 43235 is shown in SEQ ID NO: 29, and
the corresponding amino acid sequence is shown in SEQ ID NO: 30. A Bacillus subtilis strain
denoted L210R, similar to the Bacillus subtilis strain 10RS, but with the DNA coding for the
pro-region of L2 replacing the pro-region of 10RS, was made. The entire L210R protease
encoding sequence, incl. the pro-region of L2, is shown in SEQ ID NO: 31.

Two tail variants of the above construct were also made. Tail variant HV0 was
constructed to have 8 amino acids extra in the C-terminus: QSHVQSAP (SEQ ID NO: 3) with
the DNA shown in SEQ ID NO: 4 inserted in front of the TAA stop codon of the encoding
sequence. Tail variant HV1 was constructed to have 4 amino acids extra in the C-terminus:
QSAP (SEQ ID NO: 5) with the DNA sequence shown in SEQ ID NO: 6 inserted in front of the
TAA stop codon of the encoding sequence.

The L210R construct and the two tail variants were integrated by homologous
recombination on the Bacillus subtilis MB1053 host cell genome. Chloramphenicol resistant
transformants were checked for protease activity on 1% skim milk LB-PG agar plates
(supplemented with 6 µg/ml chloramphenicol). Some protease positive colonies were further
analysed by DNA sequencing of the insert to confirm the correct DNA sequence, and a strain
for each construct was selected, and denoted B. subtilis L210R, B. subtilis L210R HV0, and
B. subtilis L210R HV1, respectively.

Example 9. Fermentation yields of 10R tail-variants with heterologous pro-region

The six B. subtilis strains 10RS, 10RS HV0, 10RS HV1, L210R, L210R HV0, and
L210R HV1 were fermented on a rotary shaking table in 500 ml baffled Erlenmeyer flasks
containing 100 ml TY supplemented with 6 mg/l chloramphenicol. Six Erlenmeyer flasks for
each of the B. subtilis strains were fermented in parallel. Two of the six Erlenmeyer flasks were
incubated at 37°C (250 rpm), two at 30°C (250 rpm), and the last two at 26°C (250 rpm). A
sample was taken from each shake flask at day 1, 2 and 3 and analyzed for proteolytic activity.
The results are shown in figure % and in tables 9-11. As can be seen from the results, the
exchange of the pro-region from 10R with the pro-region from the L2 protease
resulted in a surprisingly high improvement in the expression level of the 10R protease as
measured by proteolytic activity in the culture broth at 37°C. The effect is most pronounced in
the two tail variants.

Table 9. Relative proteolytic activities at 37°C.

             Day 1    Day 2    Day 3
10RS         1,0      1,0      1,0
10RS HV0     3,7      8,9      [illegible]
10RS HV1     3,9      8,5      4,3
L210R        1,9      2,3      1,6
L210R HV0    5,3      14,4     7,3
L210R HV1    9,1      20,9     [illegible]

Table 10. Relative proteolytic activities at 30°C.
Example 10. Repeat of examples 1-9 with other 10R-like proteases

Completely analogously with the above examples 1 through 9, similar experiments
are carried out with the proteases of the following Nocardiopsis strains:
(a) Nocardiopsis dassonvillei NRRL 18133 as described in WO 88/03947;
(b) Nocardiopsis sp. NRRL 18262 as described in WO 88/03947; the DNA and amino acid
sequences of the protease derived from Nocardiopsis sp. NRRL 18262 are shown in DK
patent application no. 1996 00013, and WO 01/58276 describes the use in animal feed of
acid-stable proteases related to the protease derived from Nocardiopsis sp. NRRL
18262;

(c) Nocardiopsis alba DSM 15647; the amino acid sequence of the protease is SEQ ID NO:
33, the encoding nucleotide sequence is SEQ ID NO: 32; the gene is isolated from the
genomic DNA of this strain by PCR-amplification using the two primers:
1421 (SEQ ID NO: 34): gttcatcgatcg&ateggotgcgaccggccccotccoccagtc
1604 (SEQ ID NO: 35): gcggatcctatcaggtgcgcagggtoagacc.

(d) Nocardiopsis prasina DSM 15648; the amino acid sequence of the protease is SEQ ID NO:
37, the encoding nucleotide sequence is SEQ ID NO: 36; the gene is isolated from the
genomic DNA of this strain by PCR-amplification using the two primers:
1346 (SEQ ID NO: 38): gUcat^aicgcateigcs^ceacc^gacoactcccccagtc
1602 (SEQ ID NO: 39): goggaiccMttaggtccggagacigacgccccaggag.
(e) Nocardiopsis prasina DSM 15649; the amino acid sequence of the protease is SEQ ID NO:
41, the encoding nucleotide sequence is SEQ ID NO: 40; the gene is isolated from the
genomic DNA of this strain by PCR-amplification using the two primers:

1603 (SEQ ID NO: 42): gitGatogatcgcatcggdgcoaceggacoactoccccagto, and 1602 (SEQ ID
NO: 39).

Example 11. In vitro monogastric performance of a 10R-like protease from DSM 43235
The performance of the Nocardiopsis dassonvillei subspecies dassonvillei DSM 43235
protease was assayed in a monogastric in vitro digestion model. The performance of a purified
preparation of the mature part of the protease having SEQ ID NO: 28 (prepared as described
above) was tested in an in vitro model simulating the digestion in monogastric animals. In
particular, the protease was tested for its ability to improve solubilisation and digestion of
maize/-SBM (maize/soybean meal) proteins. In the tables below, this protease is designated
"protease of the invention."

The in vitro system consisted of 15 flasks in which maize/-SBM substrate was initially
incubated with HCl/pepsin - simulating gastric digestion - and subsequently with pancreatin -
simulating intestinal digestion. 10 of the flasks were dosed with the protease at the start of the
gastric phase whereas the remaining flasks served as blanks. At the end of the intestinal
incubation phase samples of in vitro digesta were removed and analysed for solubilised and
digested protein.



Table 12: Outline of in vitro digestion procedure

Components added | pH | Temperature | Time course | Simulated digestion phase
10 g maize/-SBM substrate (6:4), 41 ml HCl (0.105 M) | 3.0 | 40°C | t=0 min | Mixing
5 ml HCl (0.105 M) / pepsin (3000 U/g substrate), 1 ml protease of the invention | 3.0 | 40°C | t=30 min | Gastric digestion
18 ml H2O | 3.0 | 40°C | t=1.0 hour | Gastric digestion
7 ml NaOH (0.39 M) | 6.8 | 40°C | t=1.5 hours | Intestinal digestion
5 ml NaHCO3 (1 M) / pancreatin (8 mg/g diet) | 6.8 | 40°C | t=2.0 hours | Intestinal digestion
Terminate incubation | 7.0 | 40°C | t=6.0 hours |
Conditions

Substrate: 4 g SBM, 6 g maize (premixed)
pH: 3.0 stomach step / 6.8-7.0 intestinal step
HCl: 0.105 M for 1.5 hours (i.e. 30 min HCl-substrate premixing)
pepsin: 3000 U/g diet for 1 hour
pancreatin: 8 mg/g diet for 4 hours
temperature: 40°C
Replicates: 5

Solutions

0.39 M NaOH
0.105 M HCl
0.105 M HCl containing 6000 U pepsin per 5 ml
1 M NaHCO3 containing 16 mg pancreatin per ml
125 mM NaAc-buffer, pH 8.0
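
The time course and the stated step durations can be cross-checked against each other. The Python sketch below does this under explicit assumptions: the addition times are read as 0, 0.5, 1.0, 1.5, 2.0 and 6.0 hours (the termination time of 6.0 hours is the reading implied by the 4 hour pancreatin step), and each flask holds 10 g of substrate. All names are illustrative.

```python
SUBSTRATE_G = 10.0  # maize/SBM substrate per flask

# (time in hours, addition), following the outline of the procedure.
schedule = [
    (0.0, "substrate + HCl (mixing)"),
    (0.5, "HCl/pepsin + protease"),
    (1.0, "H2O"),
    (1.5, "NaOH"),
    (2.0, "NaHCO3/pancreatin"),
    (6.0, "terminate incubation"),
]
times = {label: t for t, label in schedule}

# Durations implied by the schedule.
hcl_hours = times["NaOH"] - times["substrate + HCl (mixing)"]
pepsin_hours = times["NaOH"] - times["HCl/pepsin + protease"]
pancreatin_hours = times["terminate incubation"] - times["NaHCO3/pancreatin"]

# Pancreatin dose: 5 ml of the 16 mg/ml solution per flask.
pancreatin_mg_per_g = 5.0 * 16.0 / SUBSTRATE_G

print(hcl_hours, pepsin_hours, pancreatin_hours, pancreatin_mg_per_g)
```

With these readings the schedule reproduces the listed conditions: 1.5 hours in HCl, 1 hour with pepsin, 4 hours with pancreatin, and a pancreatin dose of 8 mg per g diet.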



Enzyme protein determinations

The amount of protease enzyme protein (in what follows, Enzyme Protein is
abbreviated EP) is calculated on the basis of the A280 values and the amino acid sequences
(amino acid compositions) using the principles outlined in S.C. Gill & P.H. von Hippel, Analytical
Biochemistry 182, 319-326, (1989).



Experimental procedure for in vitro model
The experimental procedure was according to the above outline. pH was measured at
time 1, 2.5, and 5.5 hours. Incubations were terminated after 6 hours and samples of 30 ml
were removed and placed on ice before centrifugation (10000 x g, 10 min, 4°C). Supernatants
were removed and stored at -20°C.



Analysis

All samples were analysed for % degree of protein hydrolysis (DH) with the OPA method as well as
content of solubilised and digested protein using gel filtration.

DH determination by the OPA method
The Degree of Hydrolysis (DH) of protein in different samples was determined using a
semi-automated microtiter plate based colorimetric method (Nielsen, P.M.; Petersen, D.;
Dambmann, C. Improved method for determining food protein degree of hydrolysis. J. Food Sci.
2001, 66, 642-646). The OPA reagent was prepared as follows: 7.620 g di-Na tetraborate
decahydrate and 200 mg sodium dodecyl sulphate (SDS) were dissolved in 150 ml deionized
water. The reagents were completely dissolved before continuing. 180 mg o-phthaldialdehyde
97% (OPA) was dissolved in 4 ml ethanol. The OPA solution was transferred quantitatively to
the above-mentioned solution by rinsing with deionized water. 176 mg dithiothreitol 99% (DTT)
was added to the solution that was made up to 200 ml with deionized water. A serine standard
(0.9516 meqv/l) was prepared by solubilising 50 mg serine (Merck, Germany) in 500 ml
deionized water.
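The stated standard concentration can be checked with a short calculation. The sketch below assumes a molar mass of 105.09 g/mol for serine and one amino equivalent per molecule; the function name is illustrative only.

```python
# Sketch: concentration of a serine standard in meqv/l (one amino group per molecule).
SERINE_MOLAR_MASS = 105.09  # g/mol, assumed literature value


def standard_meqv_per_l(mass_mg: float, volume_ml: float,
                        molar_mass: float = SERINE_MOLAR_MASS) -> float:
    """Return the amino-group concentration of the standard in meqv/l."""
    mmol = mass_mg / molar_mass          # mg divided by g/mol gives mmol
    return mmol / (volume_ml / 1000.0)   # mmol per litre


# 50 mg serine made up to 500 ml
print(round(standard_meqv_per_l(50, 500), 4))
```

For 50 mg in 500 ml this reproduces the 0.9516 meqv/l quoted for the standard.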

The sample solution was prepared by diluting each sample to an absorbance (280 nm)
of about 0.5. Generally, supernatants were diluted (100 x) using an automated Tecan dilution
station (Männedorf, Switzerland). All other spectrophotometer readings were performed at 340
nm using deionized water as the control. 25 µl of sample, standard and blind was dispensed
into a microtiter plate. The microtiter plate was inserted into an iEMS MF reader (Labsystems,
Finland) and 200 µl of OPA reagent was automatically dispensed. Plates were shaken (2 min;
700 rpm) before measuring absorbance. Finally, the DH was calculated. Eightfold
determination of all samples was carried out.
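The text does not spell out how DH is computed from the absorbance readings. The following Python sketch shows one common way of doing it with the OPA method, under stated assumptions: serine-NH2 equivalents are obtained from the ratio of blank-corrected absorbances, and DH follows from h = (serine-NH2 - beta) / alpha and DH = h / h_tot x 100. The default constants (alpha = 0.970, beta = 0.342, h_tot = 7.8) are values commonly quoted for soy protein and should be verified against the cited reference; the function and parameter names are hypothetical.

```python
# Sketch only: degree of hydrolysis (DH) from OPA absorbance readings at 340 nm.
# alpha, beta and h_tot are ASSUMED soy-protein constants, not values from this text.

def degree_of_hydrolysis(a_sample: float, a_blank: float, a_standard: float,
                         protein_g_per_l: float,
                         standard_meqv_per_l: float = 0.9516,
                         alpha: float = 0.970, beta: float = 0.342,
                         h_tot: float = 7.8) -> float:
    """Return DH in percent for one diluted sample."""
    if a_standard == a_blank:
        raise ValueError("standard and blank absorbances must differ")
    if protein_g_per_l <= 0:
        raise ValueError("protein concentration must be positive")
    ratio = (a_sample - a_blank) / (a_standard - a_blank)
    # serine-NH2 equivalents per gram of protein in the measured dilution (meqv/g)
    serine_nh2 = ratio * standard_meqv_per_l / protein_g_per_l
    h = (serine_nh2 - beta) / alpha      # hydrolysed peptide bonds, meqv/g protein
    return h / h_tot * 100.0


def mean_dh(replicate_values):
    """Average DH over replicate wells (e.g. an eightfold determination)."""
    values = list(replicate_values)
    return sum(values) / len(values)
```

Here `protein_g_per_l` is the protein concentration of the diluted sample in the well, which the user has to supply from the dilution scheme.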

The content of solubilised protein in supernatants from in vitro digested samples was
estimated by quantifying crude protein (CP) using gel filtration HPLC. Supernatants were
thawed, filtered through 0.45 µm polycarbonate filters and diluted (1:50, v/v) with H2O. Diluted
samples were chromatographed by HPLC using a Superdex Peptide PE (7.5 x 300 mm) gel
filtration column (Global). The eluent used for isocratic elution was 50 mM sodium phosphate
buffer (pH 7.0) containing 150 mM NaCl. The total volume of eluent per run was 26 ml and the
flow rate was 0.4 ml/min. Elution profiles were recorded at 214 nm and the total area under the
profiles was determined by integration. To estimate protein content from integrated areas, a
calibration curve (R2 = 0.9993) was made from a dilution series of an in vitro digested reference
maize/SBM sample with known total protein content. The protein determination in this
reference sample was carried out using the Kjeldahl method (determination of % nitrogen;
A.O.A.C. (1984) Official Methods of Analysis 14th ed.; Washington DC).
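A calibration of this kind can be sketched as an ordinary least-squares line relating integrated peak area to protein content. The example below is a minimal illustration with hypothetical function names; it does not reproduce the authors' data or software.

```python
# Sketch: least-squares calibration line (integrated area -> protein content).

def fit_line(x, y):
    """Return (slope, intercept, r_squared) of the least-squares line y = a + b*x."""
    n = len(x)
    if n < 2 or n != len(y):
        raise ValueError("need two or more paired points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1.0 - ss_res / ss_tot if ss_tot else 1.0
    return slope, intercept, r_squared


def protein_from_area(area, slope, intercept):
    """Estimate protein content of a sample from its integrated area."""
    return intercept + slope * area
```

In use, `x` would hold the integrated areas of the dilution series and `y` the corresponding Kjeldahl-based protein contents.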

The content of digested protein was estimated by integrating the chromatogram area
corresponding to peptides and amino acids having a molecular mass of 1500 Dalton or below
(Savoie, L.; Gauthier, S.F. Dialysis Cell For The In-vitro Measurement Of Protein Digestibility. J.
Food Sci. 1986, 51, 494-498; Babinszky, L.; Van Der Meer, J.M.; Boer, H.; Den Hartog, L.A. An In-vitro
Method for Prediction of The Digestible Crude Protein Content in Pig Feeds. J. Sci. Food Agr.
1990, 50, 173-178; Boisen, S.; Eggum, B.O. Critical Evaluation of In-vitro Methods for
Estimating Digestibility in Simple-Stomach Animals. Nutrition Research Reviews 1991, 4, 141-
162). To determine the 1500 Dalton dividing line, the gel filtration column was calibrated using
cytochrome C (Boehringer, Germany), aprotinin, gastrin I, and substance P (Sigma Aldrich,
USA), as molecular mass standards.
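One way to locate such a dividing line is to fit log10(molecular mass) against elution volume for the standards and then integrate the part of the chromatogram that elutes after the 1500 Dalton position. The sketch below assumes a log-linear calibration and trapezoidal integration; the function names are hypothetical and any elution volumes passed in would be the user's own measurements.

```python
# Sketch: gel filtration mass calibration and the fraction of area at or below 1500 Da.
import math


def fit_log_mass_calibration(volumes_ml, masses_da):
    """Least-squares fit of log10(mass) = a + b * elution volume; returns (a, b)."""
    logs = [math.log10(m) for m in masses_da]
    n = len(volumes_ml)
    mean_v = sum(volumes_ml) / n
    mean_l = sum(logs) / n
    svv = sum((v - mean_v) ** 2 for v in volumes_ml)
    svl = sum((v - mean_v) * (l - mean_l) for v, l in zip(volumes_ml, logs))
    b = svl / svv
    return mean_l - b * mean_v, b


def cutoff_volume(a, b, mass_da=1500.0):
    """Elution volume that corresponds to the given molecular mass."""
    return (math.log10(mass_da) - a) / b


def digested_fraction(volumes_ml, signal, cutoff_ml):
    """Fraction of total area eluting at or after the cut-off (smaller molecules elute later)."""
    total = 0.0
    late = 0.0
    for i in range(1, len(volumes_ml)):
        v0, v1 = volumes_ml[i - 1], volumes_ml[i]
        s0, s1 = signal[i - 1], signal[i]
        total += 0.5 * (s0 + s1) * (v1 - v0)
        if v0 >= cutoff_ml:
            late += 0.5 * (s0 + s1) * (v1 - v0)
        elif v1 > cutoff_ml:
            # segment straddles the cut-off: interpolate the signal at the cut-off
            s_cut = s0 + (s1 - s0) * (cutoff_ml - v0) / (v1 - v0)
            late += 0.5 * (s_cut + s1) * (v1 - cutoff_ml)
    return late / total
```

The standards named in the text span roughly 1.3 to 12.4 kDa (approximate literature masses), so the 1500 Dalton position falls inside the calibrated range rather than being extrapolated.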

Results 

The results shown in Tables 13 and 14 below indicate that the protease increased the
Degree of Hydrolysis (DH), as well as soluble and digestible protein significantly.

Table 13: Degree of Hydrolysis (DH), absolute and relative values

                                        Of total protein           Relative to blank
Enzyme (dosage in mg EP/kg feed)    n   %DH       SD               %DH        %CV
Blank                               5   26.84     [illegible]      100.0      [illegible]
Protease of the invention (100)     5   28.21     0.35             105.1 b    1.25

Different letters within the same column indicate significant differences (1-way ANOVA, Tukey-
Kramer test, P<0.05). SD = Standard Deviation. %CV = Coefficient of Variance = (SD/mean
value) x 100%
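The two derived quantities in the table footnote are simple ratios. The sketch below, with illustrative function names, computes the value relative to blank and the coefficient of variance as defined above.

```python
# Sketch: derived columns of the results tables.

def relative_to_blank(value: float, blank: float) -> float:
    """Express a treatment mean as a percentage of the blank mean."""
    return value / blank * 100.0


def coefficient_of_variance(sd: float, mean: float) -> float:
    """%CV = (SD / mean value) x 100%."""
    return sd / mean * 100.0


# Absolute %DH of 28.21 against a blank of 26.84
print(round(relative_to_blank(28.21, 26.84), 1))
```

With the absolute %DH values of 28.21 and 26.84 this gives the relative value of 105.1 listed for the protease.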

Table 14: Solubilised and digested crude protein measured by AKTA HPLC

                                      Of total protein                              Relative to blank
Enzyme (dosage in mg EP/kg feed)  n   %dig. CP   SD    %sol. CP      SD             %dig. CP   CV%   %sol. CP   CV%
Blank                             [illegible]  54.1 a   1.1   [illegible]   [illegible]   100.0   2.0   100.0   [illegible]
Protease of the invention (50)    5   57.7 b     1.1   93.2          1.4            108.7      1.9   103.4      1.5
Protease of the invention (100)   [illegible]  56.3 c   0.3   [illegible]   0.5     [illegible]

Different letters within the same column indicate significant differences (1-way ANOVA, Tukey-
Kramer test, P<0.05). SD = Standard Deviation. %CV = Coefficient of Variance = (SD/mean
value) x 100%
Example 12. In vitro aquaculture performance of 10R-like protease from DSM 43235

Performance of the protease from Nocardiopsis dassonvillei subsp. dassonvillei DSM
43235 in an aquaculture in vitro model. The protease preparation as described in Example 3
was tested in an aquaculture in vitro model simulating the digestion in coldwater fish. The in
vitro system consisted of 15 flasks in which SBM substrate was initially incubated with
HCl/pepsin - simulating gastric digestion - and subsequently with pancreatin - simulating
intestinal digestion. 10 of the flasks were dosed with the protease at the start of the gastric
phase whereas the remaining 5 flasks served as blanks. At the end of the intestinal incubation
phase samples of in vitro digesta were removed and analysed for solubilised and digested
protein.



Components added                               pH             Temperature    Time course    Simulated digestion phase

10 g extruded SBM substrate,                   [illegible]    [illegible]    t=0 min        Gastric digestion
[illegible] ml HCl (0.155 M)/pepsin
(4000 U/g substrate), 1 ml of
the protease of the invention

7 ml NaOH (1.1 M)                              [illegible]                   t=6 hours      Intestinal digestion

5 ml NaHCO3 (1 M)/pancreatin                   6.8                           t=7 hours      Intestinal digestion
(8 mg/g diet)

Terminate incubation                           7.0            15°C           t=24 hours



Conditions

Substrate:      10 g extruded SBM
pH:             3.0 stomach step / 6.8-7.0 intestinal step
HCl:            0.155 M for 6 hours
Pepsin:         4000 U/g diet for 6 hours
Pancreatin:     8 mg/g diet for 17 hours
Temperature:    15°C
Replicates:     5

Solutions
1.1 M NaOH
0.155 M HCl / pepsin (4000 U/g diet)
1 M NaHCO3 containing 16 mg pancreatin/ml
125 mM NaAc-buffer, pH 6.0

Experimental procedure for aqua in vitro model
The experimental procedure was according to the above outline. pH was measured at
time 1, 5, 8 and 23 hours. Incubations were terminated after 24 hours and samples of 30 ml
were removed and placed on ice before centrifugation (13000 x g, 10 min, 0°C). Supernatants
were removed and stored at -20°C.

Analysis

All supernatants were analysed using the OPA method (% degree of hydrolysis) and by AKTA
HPLC to determine solubilised and digested protein (see monogastric example).

Pre-treatment of in vitro supernatants with EASY SPE columns

Before analysis on AKTA HPLC, supernatants from the in vitro system were pre-treated
using solid-phase sample purification. This was done to improve the chromatography and
thereby prevent unstable elution profiles and baselines. The columns used for extraction were
solid phase extraction columns (Chromabond EASY SPE Columns from Macherey-Nagel). 2
ml milliQ water was eluted through the columns by use of a vacuum chamber (vacuum 0.15 x
100 kPa). Subsequently 3 ml in vitro sample was dispensed onto the column and eluted
(vacuum 0.1 x 100 kPa). The first ½ ml of eluted sample was thrown away and a clean tube
was placed beneath the column; then the rest of the sample was eluted and saved for further
dilution.

Results

The results shown in Tables 16 and 17 below indicate that the protease significantly
increased Degree of Hydrolysis and protein digestibility.

Table 16: Degree of Hydrolysis (DH) measured by the OPA method, absolute and relative
values

                                      Of total protein      Relative to blank
Enzyme (mg EP/kg diet)            n   %DH        SD         %DH        %CV
Blank                             5   21.30 a    0.52       100.0 a    2.42
Protease of the invention (50)    5   [illegible]

Different letters within the same column indicate significant differences (1-way ANOVA, Tukey-
Kramer test, P<0.05). SD = Standard Deviation. %CV = Coefficient of Variance = (SD/mean
value) x 100%

Table 17: Solubilised and digested crude protein measured by AKTA HPLC, absolute and
relative values

                                      Of total protein                               Relative to blank
Enzyme (mg EP/kg diet)            N   %CP dig   SD            %CP sol   SD           %CP dig   %CV   %CP sol   %CV
Blank                             5   50.0      [illegible]   89.9      3.2          100.0 a   4.8   100.0 a   [illegible]
Protease of the invention (50)    [illegible]                                        104.3     2.1   [illegible]
Protease of the invention (100)   [illegible]  53.4   [illegible]   91.6 a   1.0     107.0     0.7   101.9     [illegible]

Different letters within the same column indicate significant differences (1-way ANOVA, Tukey-
Kramer test, P<0.05). SD = Standard Deviation. %CV = Coefficient of Variance = (SD/mean
value) x 100%.



Example 13. Fermentation and activity of 10R tail-variants TQ and TP with Savinase
signal

Two of the B. subtilis strains of Example 5, strain 209 with the amino acid tail-variant
TQ, and strain 211 with the tail-variant TP, together with B. subtilis Sav-10RS, were fermented
on a rotary shaking table in 500 ml baffled Erlenmeyer flasks containing 100 ml TY
supplemented with 8 mg/l chloramphenicol. Twelve Erlenmeyer flasks for each of the three B.
subtilis strains were fermented in parallel. Four of the twelve Erlenmeyer flasks were incubated
at 37°C (250 rpm), four at 30°C (250 rpm), and the last four at 26°C (250 rpm). A sample was
taken from each shake flask at day 1, 2 and 3 and analyzed for proteolytic activity. The results
are shown in Tables 18 to 20 below.

As can be seen from the tables below, the effect of the 2 amino acid tails is a
surprisingly high improvement on the yield of the protease, as measured by activity in the
culture broth. The effect of the 2 amino acid tails is comparable to the effect observed for Sav-
10RS HV1 and Sav-10RS HV3 in Example 1.

Table 18: Relative proteolytic activities at 37°C.

10Rsynt-15



211 



1,0 



7,2 



7.0 
7,7 



I 1,0 



Table 19: Relative proteolytic activities at 30°C.

Strain        Day 1    Day 2    Day 3
10Rsynt-15    1.0      1.0      1.0
209           4.5      3.6      4.9
211           4.0      4.1      5.0



Table 20: Relative proteolytic activities at 26°C.

Strain        Day 1    Day 2    Day 3
10Rsynt-15    1.0      1.0      1.0
209           6.4      4.3      [illegible]
211           3.7      4.1      4.0



Example 14. Synthetic shuffled 10R-like protease tail-variants with signal

A synthetic tail variant 10R protease encoding gene, denoted G-MAT-22, was
constructed with a signal peptide, and the 8 amino acid C-terminal tail of SEQ ID NO: 3, and
introduced into a Bacillus host for expression. A surprisingly high yield of protease was
achieved (data not shown). The full coding DNA sequence of G-MAT-22 is shown in SEQ ID
NO: 44, and the encoded pre-pro-protease is shown in SEQ ID NO: 45. The G-MAT-22
protease is an alpha-lytic protease-like enzyme (peptidase family S1E - old notation: S2A).
This protease has a higher temperature optimum (at pH 9) than the 10R protease, as shown in
Figure 1.

Example 15. Shuffled Pro-sequences of 10R-like Proteases

Recombination of protease genes can be made independently of the specific sequence
of the parents by synthetic shuffling as described in Ness, J.E. et al. 2002 [Nature
Biotechnology, Vol. 20 (12), pp. 1251-1255, 2002]. Synthetic oligonucleotides degenerated in
their DNA sequence to provide the possibility of all amino acids found in the set of parent
proteases are designed and the genes assembled according to the reference. The shuffling
can be carried out for the full length sequence or for only part of the sequence and then later
combined with the rest of the gene to give a full length sequence.
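The bookkeeping behind such a design can be sketched as follows: for each aligned position, collect the residues that occur in any parent, and multiply the counts to obtain the theoretical number of distinct shuffled sequences. The example assumes ungapped parent sequences of equal length; the function names are hypothetical, and the sketch covers only the diversity calculation, not the oligonucleotide design or assembly described in the cited reference.

```python
# Sketch: per-position residue sets and theoretical diversity for synthetic shuffling.
from math import prod


def residues_per_position(aligned_parents):
    """For each aligned position, list the amino acids found in any parent."""
    if not aligned_parents:
        raise ValueError("one or more parent sequences are required")
    if len({len(seq) for seq in aligned_parents}) != 1:
        raise ValueError("parents must be aligned to the same length")
    return [sorted(set(column)) for column in zip(*aligned_parents)]


def library_size(aligned_parents):
    """Number of distinct sequences if every observed residue can recombine freely."""
    return prod(len(options) for options in residues_per_position(aligned_parents))
```

Positions where all parents agree contribute a factor of one, so the library size is driven entirely by the variable positions.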

In this example the amino acid sequence for the Pro-peptide part of the parent
proteases given in SEQ ID NO: 28; SEQ ID NO: 33; SEQ ID NO: 37; SEQ ID NO: 41; SEQ ID
NO: 43; and SEQ ID NO: 45 is encoded by a set of oligonucleotides and the resulting shuffled
gene fragments are combined into the context of the full length protease gene, which then
consists of DNA coding for the signal sequence, the (shuffled) Propeptide, and in this case the
mature protein of 10R protease. Examples of shuffled Pro-peptide sequences are shown in
SEQ ID NO: 46 (G-2.19), SEQ ID NO: 47 (G-2.73), SEQ ID NO: 48 (G-1.43), SEQ ID NO: 49
(G-2.6), SEQ ID NO: 50 (G-2.5), SEQ ID NO: 51 (G-2.3), SEQ ID NO: 52 (G-1.4), and SEQ ID
NO: 53 (G-1.2).

The complete protease encoding genes were inserted into the genome of B. subtilis by
homologous recombination as described above, and the proteases expressed in shake flasks
using a rich medium. The fermentation was carried out for 5 days at 30°C and the supernatant
isolated by centrifugation prior to measuring the protease activity. As control a B. subtilis clone
expressing the wild type protease 10R from Nocardiopsis sp. NRRL 18262 from an identical
construction protocol was fermented under the same conditions. The protease activity was
calculated and is presented in the table below relative to the activity of the wild type 10R
protease. Clearly the heterologous pro-regions provide an advantage over the native pro-
region of the 10R protease.



Table 21: Relative activity of 10R protease expressed with heterologous shuffled propeptides.

            Rel. activity
10R         1.0
G-1.2       2.9
G-1.4       1.4
G-2.3       1.8
G-2.4       3.4
G-2.5       3.0
G-2.6       [illegible]
G-2.7       1.8

Example 16. In vitro monogastric performance of tail-variant 10R-HV1

This example describes a dose/response study with the four amino acid tail variant
HV1 of the 10R protease in the monogastric in vitro model using 10, 25, 50, and 100 mg
EP/kg, and using 10R protease as benchmark or control. The tail variant 10R-HV1 was
constructed to have 4 amino acids extra in the C-terminus: QSAP (SEQ ID NO: 5) as
described above.

In vitro conditions:

Substrate:      4 g SBM, 6 g maize (premixed)
pH:             3.0 stomach step / 6.8-7.0 intestine step
HCl:            0.105 M for 1.5 hours (i.e. 30 min HCl-substrate premixing)
Pepsin:         3000 U/g diet for 1 hour
Pancreatin:     8 mg/g diet for 4 hours
Incubation:     40°C
Replicates:     5



Enzymes:

10R protease: FFE-2003-00047; batch FPAa-1400; 154 mg EP/g product
Freeze-dried 10R-HV1: FFE-2003-000??; 370 mg EP/g product

Solution A: 10R, 100 mg EP/kg diet:
100 mg EP/kg diet ~ 1 mg EP/flask => 1 mg EP/ml
(1 mg EP/ml x 10 ml) / 154 mg EP/g product = 0.0649 g
Prepare 10 ml: Dissolve 0.0649 g enzyme in 10 ml NaAc buffer.

Solution C: 10R-HV1, 100 mg EP/kg diet:
100 mg EP/kg diet ~ 1 mg EP/flask => 1 mg EP/ml
(1 mg EP/ml x 20 ml) / 370 mg EP/g product = 0.05405 g
Prepare 20 ml: Dissolve 0.0541 g enzyme in 20 ml NaAc buffer.

Solution D: 10R-HV1, 50 mg EP/kg diet:
50 mg EP/kg diet ~ 0.50 mg EP/flask via 1 ml ~ 0.50 mg EP/ml
Prepare 10 ml: Dilute C 2 times: 5 ml solution C + 5 ml 125 mM NaAc-buffer

Solution E: 10R-HV1, 25 mg EP/kg diet:
25 mg EP/kg diet ~ 0.25 mg EP/flask via 1 ml ~ 0.25 mg EP/ml
Prepare 12 ml: Dilute C 4 times: 3 ml solution C + 9 ml 125 mM NaAc-buffer

Solution F: 10R-HV1, 10 mg EP/kg diet:
10 mg EP/kg diet ~ 0.10 mg EP/flask via 1 ml ~ 0.10 mg EP/ml
Prepare 10 ml: Dilute C 10 times: 1 ml solution C + 9 ml 125 mM NaAc-buffer
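The dosing arithmetic for the enzyme solutions can be sketched in a few lines of Python. The calculation assumes 10 g diet per flask and 1 ml enzyme solution per flask, as in the conditions above; the function names are illustrative only.

```python
# Sketch: enzyme dosing arithmetic for the in vitro flasks.

def ep_per_flask_mg(dose_mg_ep_per_kg: float, diet_g: float = 10.0) -> float:
    """mg enzyme protein needed per flask for a given dose in mg EP/kg diet."""
    return dose_mg_ep_per_kg * diet_g / 1000.0


def product_to_weigh_g(target_mg_ep_per_ml: float, volume_ml: float,
                       mg_ep_per_g_product: float) -> float:
    """Grams of enzyme product to dissolve for a stock of the target concentration."""
    return target_mg_ep_per_ml * volume_ml / mg_ep_per_g_product


def diluted_concentration(stock_mg_per_ml: float, stock_ml: float,
                          buffer_ml: float) -> float:
    """Concentration after mixing a stock volume with buffer."""
    return stock_mg_per_ml * stock_ml / (stock_ml + buffer_ml)


print(round(product_to_weigh_g(1.0, 10, 154), 4))   # 10R product
print(round(product_to_weigh_g(1.0, 20, 370), 5))   # 10R-HV1 product
```

Applied to the 100 mg EP/kg solutions this reproduces the weigh-in amounts of 0.0649 g and 0.05405 g, and the three dilutions of the 10R-HV1 stock give 0.50, 0.25 and 0.10 mg EP/ml.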

Substrates:

Premix (40% SBM / 60% maize), FFS-2002-0012!

The 10 g sample contains 6 g maize and 4 g SBM giving a calculated protein content of 23.48
% of protein (~ 2.35 g/flask).

Chemicals:

4.005 M HCl, AT-1-00061/25
4.007 M NaOH, AT-1-00002/36
Pancreatin FFE-2002-00082, 8xUSP
Pepsin FFE-2003-00048, 471 U/mg


NaOH solution 0.39 M
Prepare 500 ml:
48.97 ml 3.982 M NaOH, fill with milliQ to 500 ml.

HCl solution 0.105 M
Prepare 2000 ml:
52.43 ml of 4.005 M HCl, fill with milliQ up to 2000 ml

(HCl/pepsin) solution: 0.105 M containing 30 000 U pepsin per 5 ml
Prepare 250 ml:
Take out approx. 150 ml from the HCl-solution, add 3.18 g pepsin and fill up to 250 ml with
the HCl solution.

125 mM NaAc-buffer, pH 6.0:
Prepared from a 2 M NaAc-buffer (KLu 04-07-2003/lab book 14189 p. 104)
---> 12.5 ml 2 M NaAc-buffer, fill up to 200 ml with milliQ
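The stock volumes above follow from C1 x V1 = C2 x V2, and the pepsin amount from the activity of the preparation. The sketch below checks these figures; it assumes that 3000 U pepsin/g diet is delivered to a 10 g diet in 5 ml of HCl/pepsin solution, which is an interpretation of the recipe rather than something the text states in this form, and the function names are hypothetical.

```python
# Sketch: stock dilution (C1*V1 = C2*V2) and pepsin weigh-in for the solutions above.

def stock_volume_ml(stock_molar: float, target_molar: float,
                    final_volume_ml: float) -> float:
    """Volume of stock needed to reach the target molarity in the final volume."""
    return target_molar * final_volume_ml / stock_molar


def pepsin_mass_g(units_per_g_diet: float, diet_g: float, dose_volume_ml: float,
                  batch_volume_ml: float, units_per_mg: float) -> float:
    """Grams of pepsin for a batch when each dose volume must carry the per-flask activity."""
    units_per_ml = units_per_g_diet * diet_g / dose_volume_ml
    return units_per_ml * batch_volume_ml / units_per_mg / 1000.0


print(round(stock_volume_ml(4.005, 0.105, 2000), 2))    # HCl
print(round(stock_volume_ml(3.982, 0.39, 500), 2))      # NaOH
print(stock_volume_ml(2.0, 0.125, 200))                 # NaAc buffer
print(round(pepsin_mass_g(3000, 10, 5, 250, 471), 2))   # pepsin, 471 U/mg
```

These calls return 52.43 ml, 48.97 ml, 12.5 ml and 3.18 g, in line with the amounts given in the recipes.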



Pancreatin, dissolved in 1 M NaHCO3 containing 8 mg pancreatin/g diet:
NaHCO3-pancreatin is pre-made, divided into portions and frozen. Made 29-04-2003 and
frozen, it is slowly thawed in refrigerator over night. The stock preparation is described in lab.
book 14165 page 088.



Flow scheme:

In the premixing phase (t=0), 10 g substrate is mixed with 41 ml HCl-1; then in the
gastric phase (t=30 min) 5 ml HCl-2 (HCl/pepsin) + 1 ml enzyme (or buffer) is added, and later
(t=1 h) the pH is measured and 18 ml water is added; and then in the intestinal phase (t=1 h 30
min) 7 ml 0.39 M NaOH is added; and later (t=2 h) 5 ml NaHCO3/pancreatin is added and the pH
is measured again twice (t=2 h 30 min & t=5 h 30 min); and finally (t=6 h) 30 ml suspension is
sampled for centrifugation. Each supernatant is immediately and carefully removed from the
centrifuge tube into glass tubes. The supernatants are split in two aliquots for further analysis.
Results are shown in Table 22.

Table 22: Treatment of samples in the monogastric in vitro model

Sample     Enzyme solution    Enzyme                  pH     mg EP/kg diet    Pepsin, U/g diet    Pancreatin, mg/g diet
1 - 5      1 ml Buffer        Blank                   3.0    0                3000                8.0
6 - 10     1 ml Solution A    10R (FFE-2003-00047)    3.0    100              3000                8.0
11 - 15    1 ml Solution C    10R-HV1 (PPA22873)      3.0    100              3000                8.0
16 - 20    1 ml Solution D    10R-HV1 (PPA22873)      3.0    50               3000                8.0
21 - 25    1 ml Solution E    10R-HV1 (PPA22873)      3.0    25               3000                8.0
26 - 30    1 ml Solution F    10R-HV1 (PPA22873)      3.0    10               3000                8.0

Soluble and Digestible Protein:

The changes in the levels of soluble and digestible crude protein in the soluble phase
of the hydrolysates were determined using an AKTA HPLC (Superdex 30 peptide column). The
results are shown in Table 23.

At a 10R-HV1 dose of 100 mg EP/kg diet the level of digestible protein was
significantly increased by 9.8%, compared to Blank. The control 10R showed a relative
improvement of 7.7%. With the lower enzyme concentrations (50, 25, and 10 mg EP/kg diet)
the relative improvements of digestible protein were 5.7%, 3.3% and 0.7%, respectively.

Table 23: HPLC results with 10R-HV1 and 10R showing the percentual changes in digestible
CP and soluble CP relative to blank. Different letters on top of the bars indicate significant
differences (1-way ANOVA, Tukey, 95%).

                              Of total protein                              Relative to blank
Enzyme [mg EP/kg diet]    n   %dig. CP   SD            %sol. CP   SD            %dig. CP      CV%           %sol. CP      CV%
Blank                     11  54.8       [illegible]   83.9       [illegible]   100.0 a       [illegible]   100.0 a       1.9
10R-HV1 [100]             5   60.2       [illegible]   88.3       1.1           109.8         1.3           105.3         1.3
10R-HV1 [50]              5   58.0       0.5           88.9       0.8           105.7         1.1           103.8         0.9
10R-HV1 [25]              [illegible]  58.7   0.8      88.3       1.0           103.3         1.4           102.9         1.2
10R-HV1 [10]              5   55.2       1.3           84.3       1.9           [illegible]   2.4           [illegible]   2.2
10R [100]                 6   58.1       0.8           87.1       1.9           107.7         1.4           103.9         [illegible]


The original 10R [100 mg EP/kg diet] improved the level of soluble protein by about
4%. The effect of 10R-HV1 was slightly higher (5.3% relative increase) and significant. With a
dose of 50 and 25 mg EP/kg diet the relative improvements were 3.8% and 2.9%, respectively,
and significant. With 10 mg EP/kg diet the relative improvement was 0.5%.

Degree of Hydrolysis:

The degree of hydrolysis (DH) was determined using the OPA method. Results are
shown in Table 24.



                                    Of total protein        Relative to blank
Enzyme [mg EP/kg diet]          n   %DH           SD        %DH        %CV
Blank                           [illegible]  25.88 a   0.43   100.0 a   1.85
10R (FFE-2003-00047) [100]      5   27.19         0.87      105.0 bc   2.46
10R-HV1 [100]                   5   27.89         0.36      107.7      1.29
10R-HV1 [50]                    5   27.34         0.57      105.8      2.08
10R-HV1 [25]                    5   26.42         0.57      102.0      2.18
10R-HV1 [10]                    5   [illegible]   [illegible]   98.6   3.78

Table 24: Degree of Hydrolysis (DH) determined by the OPA method. Absolute as well as
relative values are shown. Different letters indicate significant differences (1-way ANOVA,
Tukey 95%).

Tail-variant 10R-HV1 improved DH by 7.7%, compared to Blank. With the lower doses
(50 and 25 mg EP/kg diet) of the protease the improvements were 5.8% and 2.0%,
respectively, in line with previous findings. At the lowest dose [10 mg EP/kg diet] no effect was
seen. The original 10R [100 mg EP/kg diet] showed improvements of 5% relative to Blank.
The results of the HPLC AKTA analysis and the DH determinations clearly show that
addition of the four amino acid (SEQ ID NO: 5) long tail to 10R does not affect the performance
of the 10R protease to any significant extent.

CLAIMS

1. A secreted polypeptide which has protease activity, which polypeptide comprises at least
three non-polar or uncharged polar amino acids within the last four amino acids of the C-
terminus of the polypeptide, and which polypeptide:
(a) comprises an amino acid sequence which is at least 70% identical to the amino acid
sequence of the mature part of the polypeptide shown in SEQ ID NO: 28; SEQ ID
NO: 33; SEQ ID NO: 37; SEQ ID NO: 41; SEQ ID NO: 43; or SEQ ID NO: 45;
(b) comprises an amino acid sequence which is at least 70% identical to the amino acid
sequence of the mature part of the polypeptide encoded by the polynucleotide in
SEQ ID NO: 1; SEQ ID NO: 2; SEQ ID NO: 25; SEQ ID NO: 31; SEQ ID NO: 32;
SEQ ID NO: 36; SEQ ID NO: 40; or SEQ ID NO: 44;
(c) comprises a mature part which is a variant of the mature part of the polypeptide
having the amino acid sequence of SEQ ID NO: 28; SEQ ID NO: 33; SEQ ID NO:
37; SEQ ID NO: 41; SEQ ID NO: 43; or SEQ ID NO: 45 comprising a substitution,
deletion, extension, and/or insertion of one or more amino acids;
(d) is an allelic variant of (a), (b), or (c); or
(e) is a fragment of (a), (b), (c), or (d).
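The C-terminal criterion of claim 1 can be illustrated with a small check on the last four residues of a sequence. The grouping of amino acids used below (non-polar: A, V, L, I, P, M, F, W, G; uncharged polar: S, T, N, Q, Y, C) is a common textbook classification adopted here as an assumption; the grouping intended by the claim may be defined differently elsewhere in the specification. The function names are hypothetical.

```python
# Sketch: count non-polar or uncharged polar residues among the last four amino acids.
# The residue classes are an ASSUMED textbook grouping, not a definition from the claims.

NON_POLAR = set("AVLIPMFWG")
UNCHARGED_POLAR = set("STNQYC")


def qualifying_c_terminal_residues(sequence: str, window: int = 4) -> int:
    """Number of non-polar or uncharged polar residues in the C-terminal window."""
    tail = sequence.upper()[-window:]
    return sum(1 for aa in tail if aa in NON_POLAR or aa in UNCHARGED_POLAR)


def meets_c_terminal_criterion(sequence: str, minimum: int = 3, window: int = 4) -> bool:
    """True if the C-terminal window holds the minimum number of qualifying residues."""
    return qualifying_c_terminal_residues(sequence, window) >= minimum


print(meets_c_terminal_criterion("GVRLRT"))       # C-terminus without an added tail
print(meets_c_terminal_criterion("GVRLRTQSAP"))   # same C-terminus extended by QSAP
```

Under this grouping, a C-terminus ending in RLRT has two qualifying residues and fails the check, whereas the same sequence extended by the four amino acid tail QSAP passes it.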

2. The polypeptide according to claim 1, which is a wildtype polypeptide, an artificial variant of
a wildtype polypeptide said variant having one or more amino acid(s) added to the C-terminus
as compared to the wildtype, a shuffled polypeptide, or a protein-engineered polypeptide.

3. The polypeptide according to claim 2, wherein the one or more added amino acid(s) is (are)
non-polar or uncharged.

4. The polypeptide according to claim 3, wherein the one or more added amino acid(s) is one
or more of Q, S, V, A, or P.

5. The polypeptide according to claim 2, wherein the one or more added amino acids are
selected from the group consisting of: QSHVQSAP, QSAP, QP, TL, IT, QL, TP, LP, TI, IQ,
QP, PI, LT, TQ, IT, QQ, and PQ.

6. The polypeptide according to any of claims 1 - 5 which when expressed and before
maturation comprises a heterologous pro-region from a different protease; preferably the pro-
region is derived from an S2A or S1E protease; more preferably the pro-region is an artificial or
shuffled pro-region, and most preferably it is at least 70% identical to the pro-region shown in
SEQ ID NO: 28, SEQ ID NO: 30, SEQ ID NO: 33, SEQ ID NO: 37, SEQ ID NO: 41, SEQ ID
NO: 43, SEQ ID NO: 45, SEQ ID NO: 46, SEQ ID NO: 47, SEQ ID NO: 48, SEQ ID NO: 49,
SEQ ID NO: 50, SEQ ID NO: 51, SEQ ID NO: 52, or SEQ ID NO: 53.

7. The polypeptide according to any of claims 1 - 6 which when expressed and before
maturation comprises a heterologous secretion signal-peptide which is cleaved from the
polypeptide when the polypeptide is secreted, preferably the heterologous secretion signal
peptide is derived from a heterologous protease.

8. The polypeptide according to claim 7, wherein the heterologous secretion signal peptide
comprises an amino acid sequence having a sequence identity of at least 70% with the amino
acid sequence encoded by polynucleotides 1 - 81 of SEQ ID NO: 2, or SEQ ID NO: 44.

9. An isolated polynucleotide encoding a polypeptide as defined in any of claims 1 - 8.

10. A recombinant expression vector or polynucleotide construct comprising a polynucleotide
as defined in claim 9.

11. A recombinant host cell comprising a polynucleotide as defined in claim 9, or an
expression vector or polynucleotide construct as defined in claim 10.

12. The recombinant host cell according to claim 11 which is a Bacillus cell.

13. A transgenic plant, or plant part, comprising a polynucleotide as defined in claim 9, or an
expression vector or polynucleotide construct as defined in claim 10.

14. A transgenic, non-human animal, or products, or elements thereof, comprising a
polynucleotide as defined in claim 9, or an expression vector or polynucleotide construct as
defined in claim 10.

15. A method for producing a polypeptide as defined in any of claims 1 - 8, the method
comprising: (a) cultivating a recombinant host cell as defined in claim 11 or 12, or a transgenic
plant or animal as defined in claims 13 or 14, to produce a supernatant comprising the
polypeptide, and optionally (b) recovering the polypeptide.

16. An animal feed additive comprising at least one polypeptide as defined in any of claims 1 -
8; and
(a) at least one fat-soluble vitamin, and/or
(b) at least one water-soluble vitamin, and/or
(c) at least one trace mineral.

17. An animal feed composition having a crude protein content of 50 to 800 g/kg and
comprising at least one polypeptide as defined in any of claims 1 - 8, or at least one feed
additive of claim 16.

18. A composition comprising at least one polypeptide as defined in any of claims 1 - 8,
together with at least one other enzyme selected from amongst phytase (EC 3.1.3.8 or
3.1.3.26); xylanase (EC 3.2.1.8); galactanase (EC 3.2.1.89); alpha-galactosidase (EC
3.2.1.22); protease (EC 3.4.-.-); phospholipase A1 (EC 3.1.1.32); phospholipase A2 (EC
3.1.1.4); lysophospholipase (EC 3.1.1.5); phospholipase C (EC 3.1.4.3); phospholipase D (EC
3.1.4.4); and/or beta-glucanase (EC 3.2.1.4 or EC 3.2.1.6).

19. A method for using at least one polypeptide as defined in any of claims 1 - 8, for improving
the nutritional value of an animal feed, for increasing digestible and/or soluble protein in animal
diets, for increasing the degree of hydrolysis of proteins in animal diets, and/or for the
treatment of vegetable proteins, the method comprising including the polypeptide(s) in animal
feed, and/or in a composition for use in animal feed.

20. A method for using at least one polypeptide as defined in any of claims 1 - 8, comprising
including the polypeptide(s) in a detergent formulation.

FIG. 1

10423.204-WO.ST25.txt
SEQUENCE LISTING

<110> Lassen, Soren Flensted

<120> Improved proteases and methods for producing them

<130> 10423.204-WO-DK

<160> 53

<170> PatentIn version 3.2

<210> 1
<211> 1062
<212> DNA
<213> Nocardiopsis sp. NRRL 18262

<220>
<221> misc_feature
<222> (1)..(495)
<223> Encodes the pro-region shown in positions -165 to -1 of SEQ ID

<220>
<221> misc_feature
<222> (496)..(1059)
<223> Encodes the mature region shown in positions 1-188 of SEQ ID
NO: 43.



<400> 1 
gctactggag cattacctca gtctcctaca cctgaagcag atgcagtatc gatgcaagaa 60
gcattacaac gtgatcttga tcttacatca gctgaagctg aggaattact tgctgcacaa 120
gatacagcct ttgaagttga tgaagctgcc gctgaagcag ctggtgatgc atatggtggt 180
tcagtattcg atactgaatc actcgaactt actgtactag tgaccgatgc agcagctgtt 240
gaagctgttg aagccacagg tgcaggtaca gagctcgtat cttatggtat tgatggatta 300
gatgagatcg tacaagagct taatgcagct gatgccgttc caggtgtagt tggatggtat 360
cctgatgtag caggtgatac tgttgtctta gaagttcttg aaggctctgg agctgatgtt 420
tctggacttt tagcagacgc aggagtcgat gcatccgcgg ttgaagtgac cacgtcagat 480
cagcctgaac tctatgccga tatcattgga ggcctagcgt acacaatggg tggtcgctgc 540
agtgtaggat ttgcagccac aaatgcagct ggacaacctg gcttcgtgac agctggacat 600
tgcggccgcg tcggtacaca ggttactatc ggcaatggaa gaggtgtctt tgagcaaagc 660
gtatttcccg ggaatgatgc tgccttcgtt agaggtacgt ccaactttac gcttactaac 720
ttagtatcta gatataacac tggcggatat gcaactgtag caggtcacaa tcaagcacct 780
attggctcta gcgtctgccg ctcagggtcg actacaggat ggcattgtgg aaccattcaa 840
gctagaggtc agagcgtgag ctatcctgaa ggtaccgtaa cgaacatgac tcgtacgact 900
gtatgtgcag aaccaggtga ctctggaggt tcatatatca gcggtacgca agcgcaaggc 960
gttacctcag gtggatccgg taactgtagg acaggtggca caacgttcta ccaggaagtg 1020
acaccgatgg tgaactcttg gggagttaga ctccgtacat aa 1062

<210> 2
<211> 1143
<212> DNA
<213> Artificial sequence

<220>
<223> A synthetic 10R gene (10Rsynt-15) encoding a S2A protease denoted
"10R" fused by PCR in frame to the signal peptide encoding
sequence of a heterologous protease, Savinase.

<400> 2

atgaagaaac cgttggggaa aattgtcgca agcaccgcac tactcatttc tgttgctttt 60
agttcatcga tcgcatcggc tgctactgga gcattacctc agtctcctac acctgaagca 120
gatgcagtat cgatgcaaga agcattacaa cgtgatcttg atcttacatc agctgaagct 180
gaggaattac ttgctgcaca agatacagcc tttgaagttg atgaagctgc cgctgaagca 240
gctggtgatg catatggtgg ttcagtattc gatactgaat cactcgaact tactgtacta 300
gtgaccgatg cagcagctgt tgaagctgtt gaagccacag gtgcaggtac agagctcgta 360
tcttatggta ttgatggatt agatgagatc gtacaagagc ttaatgcagc tgatgccgtt 420
ccaggtgtag ttggatggta tcctgatgta gcaggtgata ctgttgtctt agaagttctt 480
gaaggctctg gagctgatgt ttctggactt ttagcagacg caggagtcga tgcatccgcg 540
gttgaagtga ccacgtcaga tcagcctgaa ctctatgccg atatcattgg aggcttagcg 600
tacacaatgg gtggtcgctg cagcgtagga tttgcagcca caaatgcagc tggacaacct 660
ggcttcgtga cagctggaca ttgcggccgc gtcggtacac aggttactat cggcaatgga 720
agaggtgtct ttgagcaaag cgtatttccc gggaatgatg ctgccttcgt tagaggtacg 780
tccaacttta cgcttactaa cttagtatct agatacaaca ctggcggata tgcaactgta 840
gcaggtcaca atcaagcacc tattggctct agcgtctgcc gctcagggtc gactacagga 900
tggcattgtg gaaccattca agctagaggt cagagcgtga gctatcctga aggtaccgta 960
acgaacatga ctcgtacgac tgtatgtgca gaaccaggtg actctggagg ttcatatatc 1020
agcggtacgc aagcgcaagg cgttacctca ggtggatccg gtaactgtag gacaggtggc 1080
acaacgttct accaggaagt gacaccgatg gtgaactttt ggggagttag actccgtaca 1140
taa 1143

<210> 3
<211> 8
<212> PRT
<213> Artificial sequence

<220>
<223> C-terminal amino acid tail expressed as fusion to protease of the
invention.

<400> 3
Gln Ser His Val Gln Ser Ala Pro
1               5

<210> 4
<211> 24
<212> DNA
<213> Artificial sequence

<220>
<223> Polynucleotide encoding a C-terminal amino acid tail expressed as
fusion to protease of the invention.

<400> 4
caatcgcatg ttcaatccgc tcca 24
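The relationship between the tail-encoding polynucleotides and the amino acid tails can be checked by translating the DNA with the standard genetic code. The following Python sketch builds the standard codon table and translates the tail sequences of SEQ ID NO: 4, 6 and 8, reading the second codon of SEQ ID NO: 4 as tcg; the function name is illustrative.

```python
# Sketch: translate short coding sequences with the standard genetic code.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Codons enumerated in TCAG order map onto the 64-letter amino acid string above.
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a DNA string (whitespace ignored) into one-letter amino acid codes."""
    seq = "".join(dna.split()).upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


print(translate("caatcgcatg ttcaatccgc tcca"))  # tail of SEQ ID NO: 4
print(translate("caatcggctc ct"))               # tail of SEQ ID NO: 6
print(translate("caacca"))                      # tail of SEQ ID NO: 8
```

The three translations correspond to the tails QSHVQSAP, QSAP and QP of SEQ ID NO: 3, 5 and 7.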



<21D> S 

<2.11> 4 

<212> f*RT 

<2i3> Artificial sequence 
<2ZQ> 

<221> O terminal asnno acid tall expressed as fusion to protease of the 
1 nventi on . 

<4GC» 5 

Sin ser Ala Pro 
1 



<210> 6

<211> 12

<212> DNA

<213> Artificial sequence

<220>

<223> Polynucleotide encoding a C-terminal amino acid tail expressed as
fusion to protease of the invention.

<400> 6

caatcggctc ct 12



<210> 7

<211> 2

<212> PRT

<213> Artificial sequence
<220>

<223> C-terminal amino acid tail expressed as fusion to protease of the
invention.

<400> 7

Gln Pro
1 



<210> 8

<211> 6

<212> DNA

<213> Artificial sequence
<220>

<223> Polynucleotide encoding a C-terminal amino acid tail expressed as
fusion to protease of the invention.

<400> 8

caacca 6
<210> 9
<211> 1
<212> PRT

<213> Artificial sequence
<220>

<223> C-terminal amino acid tail expressed as fusion to protease of the
invention.

<400> 9

Pro
1 



<210> 10

<211> 3

<212> DNA

<213> Artificial sequence

<220>

<223> Polynucleotide encoding a C-terminal amino acid tail expressed as
fusion to protease of the invention.

<400> 10



<210> 11 

<211> 45

<212> DNA

<213> Artificial sequence
<220>
<223> Primer #252639

<400> 11

catgtgcatg tgggtaccgc aacgttcgca gatgctgctg aagag

<210> 12

<211> 44

<212> DNA

<213> Artificial sequence
<220>

<223> Primer #251992

<400> 12 

catgtgcatg tggtcgaccg attatggagc ggattgaaca tgcg 



<210> 13

<211> 44

<212> DNA

<213> Artificial sequence
<220>

<223> Primer #179541 

<400> 13 

gcgttgagat gegcggccgc gagcgccgtt tggctgaatg atac 



<210> 14 
<211> 43 
<212> DNA

<213> Artificial sequence
<220>

<223> primer #179542 

<400> 14 

gcgttgagac agctcgagca gggaaaaatg gaaccgcttt ttc 43

<210> 15

<211> 64

<212> DNA

<213> Artificial sequence
<220>

<223> Primer #179539 

<400> 15 

ccatttgatc agaattcact ggccgtcgtt ttacaaccat tccggaaaat agtcataggc 60
atcc 64 

<210> 16

<211> 60 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Primer #179540

<400> 16

ggatccagat ctggtacccg ggtctagagt cgacgcggcg gttcgcgtcc ggacagcaca 60

<210> 17

<211> 37

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Primer #179154 

<400> 17 

gttgtaaaac gacggccagt gaattctgat caaatgg 37

<210> 18 

<211> 37 

<212> DNA

<213> Artificial sequence 



<220> 

<223> primer #179153 



<400> 18

ccgcgtcgac actagacseg ggtacctgat ct agate 3? 

<210> 19
<211> 22 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> Primer #317 
<400> 19 

tggcgcaatc ggtaccatgg gg 22
<210> 20

<211> 40

<212> DNA

<213> Artificial sequence
<220>

<223> Primer #139 NotI

<400> 20

catgtgcatg cggccgcatt aacgcgttgc cgcttctgcg 40

<210> 21 

<211> 7443

<212> DNA

<213> Artificial sequence
<220>

<223> sequence of plasmid pMBlSOS

<400> 21



tcgcgegttt cggtgatgac ggtgaaaacc tctgacacat geagctcccg gagacggtca 60

cagcttgtct gtaagcggat gccgggagca gaeaagcccg tcagggcgcg tcagcgggtg 120

ttggcgggtg tcggggctgg cttaactatg cggcatcaga gcagattgta ctgagagtgc 180

accatatgcg gtgtgaaata ccgcaeagat gcgtaaggag aaaataccgc atcaggcgcc 240

attcgcoatt caggctgcgc aactgttggg aagggcgatc ggtgcgggcc tcttcgetat 300

tacgccagct ggcgaaaggg ggatgtgctg caaggcgatt aagttgggta acgecagggt 360

tttcccagtc acgacgttgt aaaacgacgg ccagtgaatt cgataaaagt gctttttttg 420

ttgc&attga agaattatta atgttaaget taattaaaga taatatcttt gaattgtaae 480

gcccctcaaa agtaagaact acaaaaaaag astacgttat atagaastat gtttgaacct 540

tettcagatt acaaatatat tcggacggac tctacctcaa atgcttatct aactatagaa 600

tgacataeaa gcacaacctt gaaaatttga aaatataact accaatgaac ttgttcatgt 660

gaattatcgc tgtatttaat tttetcaatt caatatataa tatgceaata cattgttaca 720

agtagaaatt aagacaccct tgatagcctt actataecta acatgatgta gtattaaatg 780

aatatotaaa tatatttatg ataagaagcg acttatttat aatcattaca tatttttcta 840

ttggaatgat taagattcca atagaatagt gtataaatta tttatcttga aaggagggat 900

gcctaaaaac gaagaacatt aaaaaeatat atttgcaccg tctaatggat ttatgaaaaa 960

tcattttatc agtttgaaaa ttatgtatta tggagctctg aaaaaaagga gaggataaag 1020

aatgaagaaa ccgttgggga aaattgtcgc a&gcaccgca cteetcattt ctgttgcttt 1080

tagttcatcg atcgcatcgg ctgctgaaga agcaaaagaa aaatatttaa ttggctttaa 1140

tgagcaggaa gctgteagtg agtttgtaga acaagtagag gcaaatgacg aggtcgccat 1200

tctctctgag gaagaggaag ttgaaattga attgctrcat gaatttgaaa cgattcctgt 1260

tttatccgtt gagttaagcc cagaagatgr ggacgcgctt gaactcgatx eagcyatttc 1320

ttatattgaa gaggatgcag aagtaargac aatggcgcaa tcggtaccat ggggtatatc 1380

saaaccggtt ttcaggggaa tctgatcaca tctgetatgt tcctgacagc gatggcggcg 3180

aacccgctga ttgccaagct ggcccatgat gtcgcagg<!<? tggacttaac atggacaagc 3240

tgggcaattg ccgcgattgt accgggactt gtaagcttaa tcatcacgcc gcttgtgatt 3300

tacasactgt atccgccgga aatxaaagaa acaccggatg cggcgaaaat cgcaacagaa 3360

aaactgaaag aaatgggacc gttcaaaaaa tcsgagcttt ccatggttat cgtgtttctt 3420
ttggtgcttg tgctgtggat ttttggegge agctteaaca tcgaegctae caoaaccgca 3480

ttgatcggtt tggccgttct cttattatca caagttctga cttgggatga tatcaagaaa 3540 

gaacagggcg ettgggatac gctcacttgg tttgcggcgc ttgteatgct cgccaacttc 3600 

ttgaatgaat taggcatggt gtcttggttc agtaatgcca tgaaateate cgtattaggg 3660

ttetcttgga ttgtggcatt catcatttta attgttgtgt attattactc tcactattte 3720 

tttgcaagtg cgacagccca catcagtgcg atgtattcsg catftttgge tgtcgtcgtg 3780 

geagtgggcg caccgccgct tttagcagcg ctgagcctcg cgtteatcag caacctgttc 3840 

gggtcaacga ctcactacgg ttctggagcg gctccggtct tcttcggagc: aggctacatc 3900 

ccgcaaggca aatggtggtc catcggattt atcctgtcga ttgttcatat eatcgtatgg 3960 

cttgtgatxg gcggattatg gtggaaagta ctaggaatat ggtagaasga aaaaggcaga 4020 

cgcggtctgc ctttttttat tttcaetcct tegtaagaaa atggattttg aaaaatgaga 4080 

aaattccctg tgaaaaatgg tatgatctag gtagaaagga cggctggtgc tgtggtgaaa 4140 

aagcggttcc atttttccct gcaaacaaaa ataatggggc tgattgcggc t'ctgctggtc 4200 

tttgtcattg gtgtgctgac eattacgtta gccgttcagc ataeacaggg agaaeggaga 4260 

caggeagagc agctggcggt tcaaacggcg agaaccattt cctatatgcc gccggttaaa 4320

gagctcattg agagaaaaga cggacatgcg gctcagacgc aagaggtcat igaacaaatg 4380 

aaagaacaga ctggtgtgtt tgccatttat gttttgaacg aasaaggaga cattcgcagc 4440 

gcctctggaa aaagcggatt aaagaaactg gagcgcagca gagaaattlt gtttggtggt 4500 

tcgcatgttt ctgaaacaaa agcggatgga cgaagagtga tcagagggag cgcgc.cgatt 4560 

ataaaagaac agaagggats cagccaagtg atcggcagcg tgtctgttga ttttctgcaa 4620 

acggagacag agcaaagcat caaaaagcat ttgagaaatt tgagtgtgat tgctgtgctt 4680

gtactgctge tcggatttat tggcgecgce gtgctggcga aaagcatcag aaaggatacg 4740 

ctcgggcttg aaccgcatga gatcgcggct ctatatcgtg agaggaacgc aatyettttc 4800 

gcgattegag aagggattat tgccaccaat cgtgaaggcg tcgtcaccat gatgaacgta 4860 

tcggcggccg agatgctgaa gctgccegag ceigtgatec atcttcctat agatgacgtc 4920

atgccgggag cagggctgat gtctgtgctt gaaaaaggag aaatgctgcc gaaccaggaa 4980 

gtaagcgtca acgatcaagt gttlattalx aatacgaaag tgatgaatca aggegggcag 5040 

gcgtatggga ttgtegteag ettcagggag aaaacagagc tgaagaagct gatcgacaca 5100 

ttgacagagg ttcgcaaata ttcagaggat: cteagggcgc agactcatga attttcaaat 5160 

aagctttatg cgattttagg gctgegtega cetgcaggea tgeaa^cttg gcgtaatcat 5220 

ggtcatagct gtttcetgtg tgaaattgtt atxcgctcac aattccacac aacatacgag 5280 

ecggaagcat aaagtgtaaa gcetggggeg cctaatgagt gagctaactc acattaattg 5340 

cgttgcgctc actgcccgct ttxcagtcgg gaaacctgtc gtgccagctg eattaatgaa 5400 

tcggccaacg cgcggggaga ggcggtt-tgc :gtat*gg;g<£g -.ctcttrecget tcctcgctca 5460 


caaaaaaggg aataagggcg acacggaaat gttgaatact catactcttc ctttttcaat 7260

attattgaag cstttatcag gcrttattgtc 51 1 il QB.XMX. t-t~ gaatgtattt 7320

agaaaaataa acaaataggg gttecgcgca catttcccaj cctgacgtct 7380

aagaaaccat tattatcatg acattaacet ataaaaatag gcgtatcacg aggccctttc 7440

qtc 7443
<210> 22
<211> 5718
<212> DNA

<213> Artificial sequence
<220>

<223> Sequence of MB3.S10 genome integration region
<400> 22

gagegcegtt tggctgaatg atacaacagt ctcacttcct taetgegtct ggttgcaaaa 60

acgaagaagc aaggattccc ctcgcttctc atttgtsxta tttattatae acttttttaa 120 

gcacatcttt ggtgettgtt teaetagact tgatgcctct .gsaatcttgtc caagtgtcac 180 

ggtccgcatc atagacttgt ecatttttca ccgctttgan atttttxcag agcgggttcg 240 

ttttceactc atctacaatg gttttgcctl cgttggctga gatgaacaaa atateaggat 300 

egattttgct caattgctca aggctgacct cttgataggc gttatctgac ttcacagcgt. 360 

gtgtaaagcc tagcatttta .aagatttctc cgtcatagga tgatgaigta tg&agctgga 420 

aggaatccgc txttgcaacg ccgagaacga tgttgcggtt tteatetttc ggaagttegg 480 

etiiiagatc gttgatgact tttttgtgct cggcaagctt ttcttttcct tcatcttctt 540 

tatttaatgc tttagcaatg gtcgtaaagc tgtcgategt ttcgtcatat gtcgcttcac 600

ggctttttaa ttcsatcgtc ggggcgatrt ttttcagctg tttataaatg tttttatgge 660

gctcagcgtc agcgatgatt aaatcaggcr tcaaggaact gatgacetca agattgggtt 720 

cgctgcgtgt gectacagat gtgtaatcaa tggagetgee gacaagcttt ttaatcatat 780

Cttttttgtt gtcatctgcg atgcccactg gcgtaatgec gagattgtga acggcatcta 840 

agaatgaaag ctcaagcaca accacccgct taggtgtgcc gcttactgtc gttttiectt 900 

cttegteatg gatcactctg gaatccttag actcgctttt gccgctttcg ttgttattct 960

ggcttgatga acagccggat .gcaatgaggc aggcgagcaa taaaacactc atgatggcaa 1020 

tcaaettgtt agaataggtg cgcatgtcat tcttcctttt ttcagattta gtaatgagaa 1080 

tcattatcac atgtaaeact ataatagcat ggettateat gtcaatattt ttttagtaaa 1140 

gaaagctgcg tttttactgc tttctcatga asgcatcatc agacacaaat aagtggtatg 1200 

cagcgttaec gtgtcttcga gacaaaaacg catgggcgtt ggctttagag gtttcgaaca 1260 

tatcagcagt gacataagga aggagagtgc tgagataacc ggacaatttc ttttctattt 1320 

catctgttag tgcaaattca atgtcgecga tattcatgat aatcgagaaa acaaagtega 1380 

tatcgatatg aaaatgttcc tcggcaaaaa ccgcaagctc gtgaattcct ggtgaacatc 1440 

cggeacgctt atggaaaate tgtttgacta aatcactcac aatccaagea ttgtattgct 1500 

gttctggtga aaagtattge attagacata cctcctgctc gt*aeggataa aggcagcgtt 1560

tcatggtcgt gtgctccgtg cagcggcttc tcettaattt tgatttttct gaaaataggt 1620 

cccgttccta tcactttacc atggacggaa aacaaatagc taetaccatt ectcetgttt 1680 

ttctcttcaa tgttctggaa tetgttteag gtaeagaega tegggtatga aagaaatata 1740
tgtttgaacc 3600

ttcggacgga 3660

atgacataca agcacaacct tgaaaatttg aaaatataac taccaatgaa cttgttcatg 3720

tgaattatcg ctgtatttaa fcaatatata atatgecaat acattgttac 3780

aatgagaaaa
<210> 23
<211> 27
<212> DNA

<213> Artificial sequence
<220>

<223> Primer 1605 
<400> 23 

gacggccagt gaattcgata aaagtgc 27 



<210> 24 

<211> 42 

<212> DNA

<213> Artificial sequence
<220>

<223> Primer 1606

<220>

<221> misc_feature

<222> (13)..(13)

<223> n is a, c, g, or t

<220>

<221> misc_feature

<222> (16)..(16)

<223> n is a, c, g, or t

<400> 24 

ccagatctct stnktnktgt acggagtcta actccccaag ag 42



<210> 25 
<211> 1112
<212> DNA

<213> Nocardiopsis dassonvillei DSM 43235
<400> 25 

gettttagtt eatogatcgc atcggctgcl ccggcccccg tcccccagac cccegtegce 60 

gacgacagcg ccgecagcat g&ccgaggcg otcaagcgcg acetcgacct caccteggcc 120 

gaggccgagg agcttctctc ggegcaggaa gccgccatcg agaeegaege cgaggceaec 180 

gaggcegcgg gcgaggccta cggcggctca ctgttcgaea ccgagaccct cgaactcacc 240 

gtgctggtca ccgacgcctc egccgttgag gcggtcgagg ccaccggagc ccaggccacc 300

gtcgtctccc acggcaccg.a gggcctgaec gaggtcgtgg aggacctcaa cggcgccgag 360 

gttcccgaga gcgtcctcgg ctggtacccg gacgcggaga gcgacaccgt cgtggtcgag 420 

gtgctggagg gctccgacgc cgacgtcgcc gecctgeteg cegacgccgg tgtggactcc 480 

tcctcggtcc gggtggagga ggccgaggag gccccgcagg tetacgcega catcategge. 540 

ggcctggcct actacatggg cggccgctgc tccgtcggct tcgccgcgac caacagcgcc 600 

ggtcagcceg gtttcgtcac cgccggct-ac tgcggcaccg tcggcaccgg cgtgaccatc 660 

ggcaacggca ccggcacett eeagaactcg gtcttecceg gcaacgacgc cgeettcgte 720 

cgcggcacct ccaacttcac cctgaceaac etggtetcge gctacaactc cggcggctac 780

cagtcggtga ccggtaccag ccaggecccg gceggctcgg ccgtgtgccg eteeggctcc 840
aetaccggct ggcactgcgg caccatccag gccegeaaec agaccgtgcg ctaccegcag 900

ggcaccgtct actegcteac ccgcaccaae gtgtgcgccg agctcggcga ctctggcggt 960 

tegtttalct ccggctcgca ggcccagggc ftcacctccg geggctecgg caactgctcc 1020 

gtcggcggca cgacctacta ccaggaggtc acccc.gat.ga tcaactcctg gggtgtcagg 1080 

atecggaect aat.cgc.atgt tcaatccgct cc 1112 

<210> 26 

<211> 48

<212> DNA

<213> Artificial sequence
<220>

<223> Primer 1423

<400> 26 

gcttttagtt catcgatcgc atcggctgct ccggcccccg tcccccag 48 

<210> 27

<211> 45

<212> DNA

<213> Artificial sequence
<220>

<223> Primer 1475

<400> 27 

ggagcggatt gaacatgcga ttaggtccgg atcctgacac cccag 45 

<210> 28

<211> 354

<212> PRT

<213> Nocardiopsis dassonvillei DSM 43235
<220>

<221> PROPEP

<222> (1)..(166)

<220>

<221> mat_peptide
<222> (167)..(354)

<400> 28

Ala Pro Ala Pro Val Pro Gln Thr Pro Val Ala Asp Asp Ser Ala
    -165                -160                -155

Ala Ser Met Thr Glu Ala Leu Lys Arg Asp Leu Asp Leu Thr Ser
    -150                -145                -140

Ala Glu Ala Glu Glu Leu Leu Ser Ala Gln Glu Ala Ala Ile Glu
    -135                -130                -125

Thr Asp Ala Glu Ala Thr Glu Ala Ala Gly Glu Ala Tyr Gly Gly
    -120                -115                -110

Ser Leu Phe Asp Thr Glu Thr Leu Glu Leu Thr Val Leu Val Thr Asp
    -105                -100                -95

Ala Ser Ala Val Glu Ala Val Glu Ala Thr Gly Ala Gln Ala Thr Val
-90                 -85                 -80                 -75

Val Ser His Gly Thr Glu Gly Leu Thr Glu Val Val Glu Asp Leu Asn
                -70                 -65                 -60

Gly Ala Glu Val Pro Glu Ser Val Leu Gly Trp Tyr Pro Asp Val Glu
            -55                 -50                 -45

Ser Asp Thr Val Val Val Glu Val Leu Glu Gly Ser Asp Ala Asp Val
        -40                 -35                 -30

Ala Ala Leu Leu Ala Asp Ala Gly Val Asp Ser Ser Ser Val Arg Val
    -25                 -20                 -15

Glu Glu Ala Glu Glu Ala Pro Gln Val Tyr Ala Asp Ile Ile Gly Gly
-10                 -5              -1  1               5

Leu Ala Tyr Tyr Met Gly Gly Arg Cys Ser Val Gly Phe Ala Ala Thr
            10                  15                  20

Asn Ser Ala Gly Gln Pro Gly Phe Val Thr Ala Gly His Cys Gly Thr
        25                  30                  35

Val Gly Thr Gly Val Thr Ile Gly Asn Gly Thr Gly Thr Phe Gln Asn
    40                  45                  50

Ser Val Phe Pro Gly Asn Asp Ala Ala Phe Val Arg Gly Thr Ser Asn
55                  60                  65                  70

Phe Thr Leu Thr Asn Leu Val Ser Arg Tyr Asn Ser Gly Gly Tyr Gln
                75                  80                  85

Ser Val Thr Gly Thr Ser Gln Ala Pro Ala Gly Ser Ala Val Cys Arg
            90                  95                  100

Ser Gly Ser Thr Thr Gly Trp His Cys Gly Thr Ile Gln Ala Arg Asn
        105                 110                 115

Gln Thr Val Arg Tyr Pro Gln Gly Thr Val Tyr Ser Leu Thr Arg Thr
    120                 125                 130

Asn Val Cys Ala Glu Pro Gly Asp Ser Gly Gly Ser Phe Ile Ser Gly
135                 140                 145                 150

Ser Gln Ala Gln Gly Val Thr Ser Gly Gly Ser Gly Asn Cys Ser Val
                155                 160                 165

Gly Gly Thr Thr Tyr Tyr Gln Glu Val Thr Pro Met Ile Asn Ser Trp
            170                 175                 180

Gly Val Arg Ile Arg Thr
        185

<210> 29 
<211> 498
<212> DNA

<213> Nocardiopsis dassonvillei DSM 43235
<400> 29

gctccggccc ccgtccccca gacccccgtc gecgacgaea gcgccgccag catgaccgag 60 

gegctcaagc gcgacctcga cctcacctcg gccgaggeeg aggagcttct ctcggcgcag 120 

gaagccgcca tcgagaccga cgccgaggcc aecgaggccg cgggcgaggc ctaeggcggc 180

teactgttcg acaecgagac cctcgaactc accgtgctgg tcaecgacgc ctccgccgtc 240 

gaggcggtcg aggccaccgg agcccaggcc accgtcgtct cccaeggcac egagggcctg 300 

aecgaggtcg tggaggacct caacggcgcc gaggtteccg agagcgtcct cggctggtac 360

ccggacgtgg agagcgacac cgtcgtggtc gaggtgctgg agggctccga cgccgacgte 420

gccgcicctgc tcgccgaegc cggtgtggac teeteetcgg tccgggtgga ggaggccgag 480 
gaggccccgc aggtctac 498

<210> 30

<211> 166

<212> PRT

<213> Nocardiopsis dassonvillei DSM 43235

<400> 30 

Ala Pro Ala Pro Val Pro Gln Thr Pro Val Ala Asp Asp Ser Ala Ala
1               5                   10                  15

Ser Met Thr Glu Ala Leu Lys Arg Asp Leu Asp Leu Thr Ser Ala Glu
            20                  25                  30

Ala Glu Glu Leu Leu Ser Ala Gln Glu Ala Ala Ile Glu Thr Asp Ala
        35                  40                  45

Glu Ala Thr Glu Ala Ala Gly Glu Ala Tyr Gly Gly Ser Leu Phe Asp
    50                  55                  60

Thr Glu Thr Leu Glu Leu Thr Val Leu Val Thr Asp Ala Ser Ala Val
65                  70                  75                  80

Glu Ala Val Glu Ala Thr Gly Ala Gln Ala Thr Val Val Ser His Gly
                85                  90                  95

Thr Glu Gly Leu Thr Glu Val Val Glu Asp Leu Asn Gly Ala Glu Val
            100                 105                 110

Pro Glu Ser Val Leu Gly Trp Tyr Pro Asp Val Glu Ser Asp Thr Val
        115                 120                 125

Val Val Glu Val Leu Glu Gly Ser Asp Ala Asp Val Ala Ala Leu Leu
    130                 135                 140

Ala Asp Ala Gly Val Asp Ser Ser Ser Val Arg Val Glu Glu Ala Glu
145                 150                 155                 160

Glu Ala Pro Gln Val Tyr
                165

<210> 31
<211> 1146
<212> DNA

<213> Artificial sequence

<220>

<223> The DNA sequence coding for the pro-region of SEQ ID NO: 29 fused
in frame to A1918L2 protease tail-variant encoding gene; whole
construct % 10&<proA19iSU} .,

<400> 31








atgaagaaac cgttggggaa aattgtcgca agcaeegcac taetcatttc tgttgcTTTx 60

agttcatcga tcgeateggc tgctccggcc cccgtccccc agacceccgt cgccgaegac 120

agcgecgcca gcatgaecga ggcgctcaag cgcgacctcg acctcaccte ggccgaggcc 180

gaggagcttc tctcggcgca ggaagccgcc atcgagaccg acgccgsggc caccgaggec 240

gcgggcgagg cctacggcgg ctcactgttc gacaccgaga ccctcgaact caccgtgctg 300

gtcaccgacg cctccgccgt cgaggcggtc gaggccaccg gageccaggc caccgtcgtc 360

teecacggca cegasggect gaccgaggtc gtggaggacc tcaacggcgc cgaggttccc 420

gagagcgtcc tcggctggta cccggacgtg gagagcgaca eegtcgtggt cgaggtgctg 480

gagggctccg acgecgacgt cgccgcectg ctcgccgacg ccggtgtgga ctcctcctcg 540

gtccgggtgg aggaggccga ggaggccccg caggtctatg ccgatatcat tggaggccta 600

gcgtaeacaa tgggtggtcg ctgcagcgta ggatttgcag ccacaaatgc agttggaeaa 660

ectggcttcg tgacagctgg acattgcggc cgcgtcggta cacaggttac tatcggcaat 720

ggaagaggtg tctttgagca aagcgtattt cccgggaatg atgctgcctt cgrtagaggt 780

acgtcca&et ttacgcttac taacttagta tctagataca acactggcgg atatgcaact 840

gtagcaggtc acaatcaagc aectattggc tctagcgtet gecgetcagg gtcgactaca 900

ggatggcatt gtggaaccat tcaagctaga ggtcagagcg tgagctatcc tga&ggtacc 960

gtaacgaaea tgactcgtac gactgtatgt gcagaaccag gtgaetctgg aggttcatat 1020

atcageggta cgcaagcgca aggcgttacc tcaggtggat ecggtaaetg taggacaggt 1080

ggcaeaacgt tctaecagga agtgaeatcg atggtgaact ettggggagt tagactecgt 1140

acataa 1146
<210> 32
<211> 1068
<212> DNA

<213> Nocardiopsis alba DSM 15647
<400> 32 

gcgaccggcc ccctccccca gtxccccacc ceggatgaag ccgaggccac caccatggtc 60 

gaggcectec agcgcgaect cggcctgtcc eectc.tc.agg ccgacgagct cctegaggeg 120 

caggccgagt ccttcgagat cgacgaggec gecasegcgg ccgcagcega ctcetacggc 180 

ggctccatct tc.gacacc.ga cagcctcacc ctgaccgtcc tggteaecga egcetecgec 240 

gtcgaggcgg tcgaggccgc cggcgccgag gccaaggtgg tctcgcacgg catggagggc 300 

ctggaggaga tcgtcgccga cctgaaegcg gccgaegctc ageeeggcgt cgtgggctgg 360 

tacccegaca tccactecga cacggtcgtc ctcgaggtcc tegagggcte cggtgccgac 420 

gtggactccc tgetcgccga cgccggtgtg gacaccgecg ac.gtcaaggt. ggagagcacc 480 

accgagcagc ccgagrtgta cgccgacatc atcggcggtc tcgcctacae catgggtggg 540 

cgetgctcgg tcggcttcgc ggccaccaac -§.cctccg§cc agcccgggtt cgtcaeegcc 600 

ggccactgcg geaecgtcgg caccccggtc agcatcggca aeggccaggg cgtcttcgag 660 

cgttccgtct tccccggcaa cgacteegcc ttcgtccgcg gcacctcgaa cttcaeectg 720 

aceaaeetgg tcageegcta caacaccggt ggttacgcga ccgtctccgg ctcctcgcag; 780 

gcggcgateg gctcgcagat ctgccgttcc ggctccacca ccggctggca ctgcggcacc 840 

gtccaggecx geggccagac ggtgagctac ceccsgggea ccgtgeagaa cctgateegc: 900 

accaacgtct gcgccgagcc cggtgactcc ggcggctcct tcatctccgg cagccaggcc 960 

cagggegtca ectccggtgg ctccggcaac igcteettcg gtggcaccac ctactaccag 1020 

gaggtcaacc cgstgctgag cagctggggt ctgaccctgc gcacctga 1068 



<210> 33 

<211> 355

<212> PRT

<213> Nocardiopsis alba DSM 15647

<220>

<221> PROPEP
<222> (1)..(167)

<220>

<221> mat_peptide
<222> (168)..(355)

<400> 33 

Ala Thr el y Pro Leu Pro Sin Ser pro Tttr Pro Asp 61 a Ala slu 
-165 -160 -XS5 



Ala Thr Thr mt Val 6lu Ala Lew Sin Arg Asp Leu Sly Leu Sep 
-150 -145 -140 



Pro ser Gin Ala Asp Gin lm Leu Sltf Ala Gin Ala Glu Sar Phe 
-135                -130                -125

<3lu lie Asp Slu Ala Ala thr Ala Ala Ala Ala Asp. Ser Tyr sly 
-120 -115 -110 

sly ser lie Phe asp Thr Asp ser Leu Thr Leu Thr Val Leu val Thr 
-105 " -100 -95 

Asp Ala ser Ala val elu Ala val si a Ala Ala Sly Ala 61 u Ala tys 
' -90 -85 -80 

Val Val ser His Slv mt Slu Sly leu slu Glu Tie val Ala Asp Leu 
-75 ~?0 H5S -60 

Asn Ala Ala Asp Ala sin pro Sly val val si y rrp Tyr pro Asp He 
•-SS -50 -4S 

His ser Asp Thr val val Leu 61 u Val Leu Glu Sly Ser sly Ala Asp 
' -40 -35 -30 

Val Asp ser Leu Leu Ala Asp Ala Sly Val Asp Thr Ala Asp Val tys 
-25 -20 ' -15 

Val Slu ser Thr Thr slu sir? Pro Glu Lm Tyr Ala Asp lis tie Sly 
-10 -5 -i 1 3 

Sly Leu Ala Tyr Thr mt. Sly sly Arq cvs Ser val sly Phe Ala Ala 

10 IS: 20 

Thr Asn Ala Ser sly sin Pro sly ehe Val Thr Ala sly His eys Sly 
25 ' 30 3$ 

■Thr val sly Thr Pro Val ser He Sly Asp sly Gin sly val phe slu 
40 45 SO 

Aro ser val Phe Pro Gly Asn asp Ser Ala Phe val Arg sly Thr ser 
~ SS ' 60 SS 

Asn Phe Thr Leu Thr Asn Leu Val Ser Are Tyr Asn Thr Gly Sly Tyr 
70 75 80 85 

Ala Thr val ser sly Ser ser sin Ala Ala He sly ser sin lie cys 
90" 9S 100 

Arq ser sly ser Thr Thr Gly rrp His Cys sly Thr Val Sin Ala Arq 
105 110 ' IIS 

sly Gin Thr Val Ser Tyr Pro sin Gly Thr val sin Asn Leu Thr Arq 
120 125 " 130 

Thr Asn Val Cys Ala slu pro Sly Asp ser Gly sly Ser Phs Ha Ser 
        135                 140                 145



aly ser <5lr* Ala Sin -.Sly val Thr ser -sly. <Sly- Ser sly Asn cys Ser 
150 155 160 165 



Phe Sly slv Thr Thr ryr Tyr sin Sly Val -As« Pro Met Leu Ser ser 
170 " 175 180 



xrp sly Leu Thr teu Arg Thr 
185 



<210> 34 

<211> 43 

<212> DNA

<213> Artificial sequence
<220>

<223> Primer 1421

<400> 34

gttcatcgat cgcatcggct gcgaccggcc ccctccccca gtc 43



<210> 35 

<211> 31

<212> DNA

<213> Artificial sequence
<220>

<223> Primer 1604

<400> 35

gcggatccta tcaggtgcgc agggtcagac c 31



<210> 36 
<211> 1062 
<212> DNA

<213> Nocardiopsis prasina DSM 15648
<400> 36

gecaccggae egctceecca gtcacccacc ccggaggccg acgccgtctc catgcaggag 60 

gcigctccagc gcgacctcgg cetgaccccg cttgaggccg atgaactgct ggcegeccag 120 

gacaccgcct tcgaggtcga cgaggccgcg gccgeggccg ccggggacgc ctacggcgge 180 

tccgtctteg aeaccgagac cctggaactg accgtcctgg tcaccgacgc egcctcggtc 240 

gaggctgtgg aggccaccgg cgcgggtacc gaactcgtct cctacggcat cgagggcctc 300 

gacgagatca tecaggatct caacgccgcc gacgccgtce ccggcgtggt cggctggtac 360 

ccggacgtgg cgggtgacac cgtcgtcctg gaggtectgg agggttsxgg agccgacgtg 420 

agcggcctgc tcgccgacgc cggcgt-ggac gcctcggccg tcptggtgac cagcagtgcg 480 

cagcccgagc tctatgccga catcatcggc ggtctggcct acaccatggg cggcc.gc.tgt 540 

tcggtcggat tcgcggccac caacgccgcc ggtcagcccg gattcgtcae cgccggtcac 600 

tgtggccgcg tgggcaccca ggtgagcate ggcaacggec agggcgtctt cgagcagtcc 660 

atctteccgg gcaacgacgc cgccttegtc cgcggcac§t ccaaettcae gctgaccaac 720 
ctggtcagcc gctacaacac eggcggttac gccaccgtcg ccggccacaa ecaggcgccc 780

atxggctcct ccgtctgccg a.ccggct.cc aecaccggct ggcactgcgg. caccatccag 840 

gcccgcggcc agtcggtgag ttacecegag ggcac.egixa ccaaeatgac ceggaccact 900 

gtgtgcgccg agcccggcga ctccggogge tcctacatct ccggcaaeca ggcccagggc 960 

gtcacetexg gcggctccgg caactgccge atcggcggga ccaecitcta ccaggaggtc 1020 

acceccatgg tgaactcctg gggcgteegt ctccgg&cct aa 1062 



<210> 37

<211> 353

<212> PRT

<213> Nocardiopsis prasina DSM 15648

<220>

<221> PROPEP

<222> (1)..(165)

<220>

<221> mat_peptide
<222> (166)..(353)

<400> 37

Ala Thr Gly Pro Leu Pro Gln Ser Pro Thr Pro Glu Ala Asp Ala
-165                -160                -155

Val Ser Met Gln Glu Ala Leu Gln Arg Asp Leu Gly Leu Thr Pro
-150                -145                -140

Leu Glu Ala Asp Glu Leu Leu Ala Ala Gln Asp Thr Ala Phe Glu
-135                -130                -125

Val Asp Glu Ala Ala Ala Ala Ala Ala Gly Asp Ala Tyr Gly Gly
-120                -115                -110

Ser Val Phe Asp Thr Glu Thr Leu Glu Leu Thr Val Leu Val Thr Asp
-105                -100                -95                 -90

Ala Ala Ser Val Glu Ala Val Glu Ala Thr Gly Ala Gly Thr Glu Leu
                -85                 -80                 -75

Val Ser Tyr Gly Ile Glu Gly Leu Asp Glu Ile Ile Gln Asp Leu Asn
            -70                 -65                 -60

Ala Ala Asp Ala Val Pro Gly Val Val Gly Trp Tyr Pro Asp Val Ala
        -55                 -50                 -45

Gly Asp Thr Val Val Leu Glu Val Leu Glu Gly Ser Gly Ala Asp Val
    -40                 -35                 -30

Ser Gly Leu Leu Ala Asp Ala Gly Val Asp Ala Ser Ala Val Glu Val
-25                 -20                 -15                 -10

Thr Ser Ser Ala Gln Pro Glu Leu Tyr Ala Asp Ile Ile Gly Gly Leu
                -5              -1  1               5

Ala Tyr Thr Met Gly Gly Arg Cys Ser Val Gly Phe Ala Ala Thr Asn
        10                  15                  20

Ala Ala Gly Gln Pro Gly Phe Val Thr Ala Gly His Cys Gly Arg Val
    25                  30                  35

Gly Thr Gln Val Ser Ile Gly Asn Gly Gln Gly Val Phe Glu Gln Ser
40                  45                  50                  55

Ile Phe Pro Gly Asn Asp Ala Ala Phe Val Arg Gly Thr Ser Asn Phe
                60                  65                  70

Thr Leu Thr Asn Leu Val Ser Arg Tyr Asn Thr Gly Gly Tyr Ala Thr
            75                  80                  85

Val Ala Gly His Asn Gln Ala Pro Ile Gly Ser Ser Val Cys Arg Ser
        90                  95                  100

Gly Ser Thr Thr Gly Trp His Cys Gly Thr Ile Gln Ala Arg Gly Gln
    105                 110                 115

Ser Val Ser Tyr Pro Glu Gly Thr Val Thr Asn Met Thr Arg Thr Thr
120                 125                 130                 135

Val Cys Ala Glu Pro Gly Asp Ser Gly Gly Ser Tyr Ile Ser Gly Asn
                140                 145                 150

Gln Ala Gln Gly Val Thr Ser Gly Gly Ser Gly Asn Cys Arg Thr Gly
            155                 160                 165

Gly Thr Thr Phe Tyr Gln Glu Val Thr Pro Met Val Asn Ser Trp Gly
        170                 175                 180

Val Arg Leu Arg Thr
    185



<210> 38

<212> DNA

<213> Artificial sequence 
<220> 

<223> Primer 1346 
<400> 38

gttcatcgat cgcatcggct gcca 



<210> 39
<211> 38
<212> DNA

<213> Artificial sequence
<220>

<223> Primer 1602

<400> 39

gcggatccta ttaggtccgg agacggacgc cccaggag 38



<210> 40 
<211> 1062 
<212> DNA 

<213> Nocardiopsis prasina DSM 15649
<400> 40 

gccaccggac cactccccca gtcacccacc ccggaggecg acgccgtctc catgcaggag 60

gcgctceagc gcgacctcgg cctgaceccg cttgaggccg atgaaetgct ggccgcccag 120 

gacaecgect tcgaggtcga cgaggccgcg gccgaggccg ccggtgacgc ctacggcggc 180

tccgtcttcg acaccgagac eetggaactg accgtcetgg tcaccgactc cgccgcggtc 240 

gaggcggtgg aggccatcgg cgccgggacc gaactggtct cctacggcat cacgggeetc 300 

gacgagstcg t-cgaggagct caacgccgcc gacgccgttc ccggcgtggt cggctggtac 360 

ccggacgtcg egggtgacac cgtcgtgctg gaggtcctgg. agggttccgg cgccgacgtg 420 

qgcggcctgc txgccgacgc cggcgtggac gcctcggcgg tcgaggtgac caeeacegag 480 

eagcccgagc tgtacgeega catcatcggc ggtctggcct acaccatggg cggccgctgt 540 

tcggtcgget tcgcggccac caacgccgcc ggtcagcccg ggttcgtcac cgccggteac 600 

tgtggccgcg tgggcaccca ggtgaccatc ggcaacggcc ggggc.gt.ctt cgagcagtcc 660 

atcttcccgg geaaegaege cgccttcgtc cgcggaacgt ccaacttcac getgaccaac 720 

ctggtcagcc gctaeaaeac cggcggctac gccaccgtcg ccggtcacaa ccaggcgcce 780 

ateggetect ccgtctgccg ctccggctcc accaccggtt ggcactgcgg caccatccag 840 

gcccgcggcc agteggtgag ctaccccgag ggcaccgtca ccaacatgae .gcggaecacc 900 

gtgtgcgceg agcccggcga ctccggcggc tcctacatct ccggcaacca ggcccagggc 960

gteacctccg gcggctccgg caactgccgc accggcggga ccacettcta ccaggaggtc 1020 

acccccatgg tgaactcctg gggcgtccgt ctccgoacct aa 1062 

<210> 41

<211> 353 

<212> PRT 

<213> Nocardiopsis prasina DSM 15649

<220>

<221> PROPEP
<222> (1)..(165)

<220>

<221> mat_peptide
<222> (166)..(353)

<400> 41

Ala Thr sly Pro Leu Pro Sin ser Pro Thr t>j?a Slu Ala Asp Ala 
-165 ' -160 ~1S5 

Val ser Met Sin Glu Ala Leu Gin Arg Asp Leu Gly Leu Thr Pro 
-150 -14S -140 

leu Glu Ala Asp Glu Leu Leu Ala Ala Sin Asp Thr Ala Phe Glu 
~BS -130 -125 

Val Asp Glu Ala Ala Ala Glu Ala Ala Gly Asp Ala Tyr sly Sly 
-120 -115 



ser val Phe Asp Thr Slu Thr Leu slu Leu Thr Val Leu val Thr Asp 
-105 -100 -95 -90 

ser Ala Ala Val Glu Ala val slu Ala Thr sly Ala Sly Thr Glu Leu 
~8S -80 -?S 

val ser Tyr Glv lis Thr Sly Leu Asp slu lie Val Glu Glu leu Asn 
-70 ' -65 -60 

Ala Ala Asp Ala val pro sty val val Gly Trp Tyr Pro Asp val Ala 
•55 ' -SO ~4S : 

Sly Asp Thr val val im Slu Val Leu slu sly ser sly Ala Asp val 
J -40 -35 -30 

Glv Gly Leu Leu Ala Asp Ala Gly Val Asp Ala ser Ala val Glu Val 
-2:5 ' --20 -15 -10 

Thr Thr Thr Glu Sin Pro slu Leu Tyr Ala Asp lie He Gly sly Leu 
-5 -X X 5 

Ala Tyr Thr mi Gly Gly Arg Cys ser Val Sly Phe Ala Ala Thr Asn 
10 15 20 

Ala Ala Gly Gin Pro Sly Phe val Thr Ala sly His cys Sly Arg val 
25 ' 50 35 

Gly Thr sin val Thr ile sly Asn sly Arg sly val Phe Glu Gin ser 
40 45 50 55 

ile Phe Pro Sly Asn Asp Ala Ala Phe val Arg Gly Thr ser Asn Phe 
m 65 70 

Thr Leu Thr Asn Leu val ser Arg tyr Asn Thr Gly Gly Tyr Ala Thr 

?s so as 

val Ala Glv His Asn Gin Ala Pro il« Sly Ser* ser Val Cys Arg Ser 
90' 95 100 
Gly Ser Thr Thr Gly Trp His Cys Gly Thr Ile Gln Ala Arg Gly Gln
105 110 115 

ser val ser Tyr pro slu sly Thr val thr AStt Met Thr Art? Thr fhr 
120 125 130 135 

val cys Ala Glw Pro Sly Ass ser sly Sly ser Tyr lie Ser sly ash 

145 150 



Sirs Ala Sin Sly Val Thr ser Sly Sly ser sly Asn Cys Arg Thr sly- 
US 180 165 

sly Thr Thr Fhe Tyr aln slu val Thr pre Met VaT Asn ser Trp Gly 
170 175 ISO 

Val Arg Leu Arg Thr 
183 

<210> 42

<211> 43 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Primer 1603

<400> 42

gttcatcgat cgcatcggct gccaccggac cactccccca gtc 43

<210> 43 

<211> 353 

<212> PRT

<213> Nocardiopsis sp. NRRL 18262
<220>

<221> PROPEP

<222> (1)..(165)

<220>

<221> mat_peptide
<222> (166)..(1059)

<400> 43 

Ala Thr Slv Ala Leu Pro Gin ser Pre Thr Pre Slu Ala Asp Ala 
-165 -160 "155 

val ser Met Gin slu Ala Leu Gin Arg Asp Leu Asp Leu Thr s>sr 
-ISO -145 -140 

Ala slu Ala Slu slu Leu Leu Ala Ala Gin asp Thr Ala Phe slu 
-IBS -130 -125 

val Asp Slu Ala Ala Ala Glu Ala Ala sly Asp Ala Tyr Sly Glv 
-120 ' -115 ' -110 

Ser Val Phe Aso Thr 61 u Ser Ley sits Leu Thr val leu val Thr Asp 
-105 -100 -35 -SO 

Ala Ala Ala val Slu Ala Val 61 u Ala Thr Sly Ala sly Thr Slu Leu 
-85 -80 -75 

val ser Tyr sly He Asp sly tea Asp slu lie val sin slu Leu Asn 
-70 -65 -60 

Ala Ala Asp Ala val Pro sly val val Sly Trp Tyr pro Asp val Ala 
-55 -50 -45 

Sly Aso Thr val val Ley slu Val Leu slu Sly Ser sly Ala Asp val 
•■40 -35 -30 

Ser Sly Leu Leu Ala Asp Ala sly val Asp Ala ser Ala val slu val 
-25 " -20 -15 -10 

Thr Thr Ser Asp sin Pro slu Lou Tyr Ala Asp lie lie Sly sly Leu 

Ala Tyr Thr ■■Met Sly sly Arg cys ser val sly Phe Ala Ala Thr Asn 
10 " IS 20 

Ala Ala Sly Sin Pro Sly Phe val Thr Ala sly His cys sly Arg val 

25 "'" 30 35 

Gly Thr Sirs Val Thr lie Sly Asn Slv Arg Sly val Phe slu sin ssr 
40 4S 50 55 

val Phe pro sly as ii Asp Ala Ala Phe val Arg sly Thr ser Asn Phe 
60 65 70 

Thr Leu Thr Asn Leu val ser Arcs Tyr Asn Thr Sly slv Tyr Ala Thr 
75 80 " 85 

Val Ala sly His ash Sin Ala Pro lie Sly Ser ser Val cys Arg ser 
90 9S 100 

slv ser Thr Thr civ Trp His cys sly Thr lis CTsi Ala Arg slv sin 
105 ' ' 110 115 

ser val ser Tyr Pro slu slv Thr val Thr Asn «et Thr Arq Thr Thr 
120 *' 125 " 130 135 

Val cvs Ala slu Pro Sly Asp Ser slv sly ser Tyr lie ssr Sly Thr 
140 145 150 

Sin Ala Sin civ val Thr ser sly Sly Ssr sly Asn Cys Are Thr Slv 
155 160 165 



wo mwi mn vcrmKmwmiim 

10423. 204--WO.ST2S.T.XT 

;ly Thr Thr Phs Tyr sin &u val Thr Pro 'Met Vat Asn Ser Trp 61 y 

175 180 



val Arq leu Arg rhr 
ISfi 



<210> 44 

<211> 1164 

<212> DNA 

<213> artificial sequence 
<220> 

<223> synthetic protease encoding gene 
<220> 

<221> CDS 

<222> (1)..(1164)

<223> Full length protease

<220>

<221> sig_peptide
<222> (1)..(81)

<220>

<221> misc_feature

<222> (82)..(576)

<223> Propeptide

<220>

<221> mat_peptide
<222> (577)..(1164)
<400> 44

at§ aaa aaa ecu etg qqa aaa att qtc gca age aca. gca ctt ctt 
Met Lys Lys Pro Leu Gly Lys Ile Val Ala Ser Thr Ala Leu Leu
-190 -185 -180

att tea gtg gca ttt age tea tct att qea tea «ca get aca gga 
Ile Ser Val Ala Phe Ser Ser Ser Ile Ala Ser Ala Ala Thr Gly
-175 -170 -165



gat aca gaa tea ctt qaa ctt aca gtt ctt gtt aca gat gca gca gca 
Asp Thr Glu Ser Leu Glu Leu Thr Val Leu Val Thr Asp Ala Ala Ala
-100 -95 -90

qtt oaa aca gtt gaa gca aca gga gca qua aca gta ctt gtt tea tat 
Val Glu Ala Val Glu Ala Thr Gly Ala Gly Thr Val Leu Val Ser Tyr
-85 -80 -75
45



90 



gca tta ccg caq -tct ccq aca ecg gaa oca qat gca gtc tea atg 135 
Ala Leu Pro Gln Ser Pro Thr Pro Glu Ala Asp Ala Val Ser Met
-160 -155 -150



caa gaa aca ctq caa aga gat ctt qat ctt aca tea qca qaa gca 180 

Gln Glu Ala Leu Gln Arg Asp Leu Asp Leu Thr Ser Ala Glu Ala
-145 -140 -135

gaa aaa ctt ctt get gca caa gat aca gca ttt gaa gtg gat gaa 225 

Glu Glu Leu Leu Ala Ala Gln Asp Thr Ala Phe Glu Val Asp Glu
-130 -125 -120



gca ace qca qaa qca qca aca gat gca tat gqc gee tea gtt ttt 270 
Ala Ala Ala Glu Ala Ala Gly Asp Ala Tyr Gly Gly Ser Val Phe
-115 -110 -105



318 



366 
gqa att gat aac ctt gat qaa att gtt caa gaa ctg aat gca get gat 414

Gly Ile Asp Gly Leu Asp Glu Ile Val Gln Glu Leu Asn Ala Ala Asp
-70 -65 -60 -55

act gtt ccq qqc qtt gtt ggc tgg tat ccg .qat qtt get gt?a gat aca 462 
Ala Val Pro Gly Val Val Gly Trp Tyr Pro Asp Val Ala Gly Asp Thr
-50 -45 -40

qtt gtc ctt qaa qtt ctt qaa gga tea qqc gca gat gtt tea ggc ctg 510 

Val Val Leu Glu Val Leu Glu Gly Ser Gly Ala Asp Val Ser Gly Leu
-35 -30 -25

ctg qca aac pea gqa qtt qat qca tea gca gtt gaa gtt aca aca tea 558 

Leu Ala Asp Ala Gly Val Asp Ala Ser Ala Val Glu Val Thr Thr Ser
-20 -15 -10

qat caa ccq qaa ctt tat gea gat att att ggc ggc ctg gca tat tat 606 

Asp Gln Pro Glu Leu Tyr Ala Asp Ile Ile Gly Gly Leu Ala Tyr Tyr
-5 -1 1 5 10

atq ggc ggc aqa tgc age gtt ggc ttt qca gca aca aat gca tea ggc 6S4 

Met Gly Gly Arg Cys Ser Val Gly Phe Ala Ala Thr Asn Ala Ser Gly
15 20 25

caa ccq ggc ttt gtt aca aca qqc cat tgc age aca gtt gqe aca cca 702 

Gln Pro Gly Phe Val Thr Ala Gly His Cys Gly Thr Val Gly Thr Pro
30 35 40

gtt tea att qqc aat ggc aaa ggc gtt ttt gaa cga age att ttt ccg 750 

Val Ser Ile Gly Asn Gly Lys Gly Val Phe Glu Arg Ser Ile Phe Pro
45 50 55

ggc aat gat tea gca ttt att aqa qqc aca tea aat ttt aca ctt aca 798 

Gly Asn Asp Ser Ala Phe Val Arg Gly Thr Ser Asn Phe Thr Leu Thr
60 65 70

aat ctg gtt tea aqa tat aat tea qgc qqc tat gca aca gtt gca ggc 846 

Asn Leu Val Ser Arg Tyr Asn Ser Gly Gly Tyr Ala Thr Val Ala Gly
75 80 85 90

cat aat caa gca ccg att ggc tea gca qtt tgc aga tea qqc tea aca 894 
His Asn Gln Ala Pro Ile Gly Ser Ala Val Cys Arg Ser Gly Ser Thr
95 100 105

acq qgc tgg cat tqc qqc aca att caa aca aqa aat caa aca gtt aqg 942 

Thr Gly Trp His Cys Gly Thr Ile Gln Ala Arg Asn Gln Thr Val Arg
110 115 120

tat ccg caa ggc aca ctt tat agt ctg aca aga aca aca qtt tqt gca 990 

Tyr Pro Gln Gly Thr Val Tyr Ser Leu Thr Arg Thr Thr Val Cys Ala
125 130 135

qaa ccq qqc gat tea qqc qqc tea tat att age ggc act caa qca caa 1038 

Glu Pro Gly Asp Ser Gly Gly Ser Tyr Ile Ser Gly Thr Gln Ala Gln
140 145 150

ggc gtt aca tea ggc qqc tea qgc aat tgc agt get ggc ggc aca aca 1086 
Gly Val Thr Ser Gly Gly Ser Gly Asn Cys Ser Ala Gly Gly Thr Thr
155 160 165 170

tat tac caa qaa qtt aat ccg atg ctt aqt tea tgg ggc ctt aca ctt 1134 

Tyr Tyr Gln Glu Val Asn Pro Met Leu Ser Ser Trp Gly Leu Thr Leu
175 180 185

aga aca caa teg cat gtt caa tec get cca 1164 
Arg Thr Gln Ser His Val Gln Ser Ala Pro
190 195

<210> 45
<211> 388

<212> PRT

<213> artificial sequence
<220>

<223> Synthetic Construct
<400> 45

Met Lys Lys Pro Leu Gly Lys Ile Val Ala Ser Thr Ala Leu Leu
-190 -185 -180

Ile Ser Val Ala Phe Ser Ser Ser Ile Ala Ser Ala Ala Thr Gly
-175 -170 -165

Ala Leu Pro Gln Ser Pro Thr Pro Glu Ala Asp Ala Val Ser Met
-160 -155 -150

Gln Glu Ala Leu Gln Arg Asp Leu Asp Leu Thr Ser Ala Glu Ala
-145 -140 -135

Glu Glu Leu Leu Ala Ala Gln Asp Thr Ala Phe Glu Val Asp Glu
-130 -125 -120

Ala Ala Ala Glu Ala Ala Gly Asp Ala Tyr Gly Gly Ser Val Phe
-115 -110 -105

Asp Thr Glu Ser Leu Glu Leu Thr Val Leu Val Thr Asp Ala Ala Ala
-100 -95 -90

Val Glu Ala Val Glu Ala Thr Gly Ala Gly Thr Val Leu Val Ser Tyr
-85 -80 -75

Gly Ile Asp Gly Leu Asp Glu Ile Val Gln Glu Leu Asn Ala Ala Asp
-70 -65 -60 -55

Ala Val Pro Gly Val Val Gly Trp Tyr Pro Asp Val Ala Gly Asp Thr
-50 -45 -40

Val Val Leu Glu Val Leu Glu Gly Ser Gly Ala Asp Val Ser Gly Leu
-35 -30 -25

Leu Ala Asp Ala Gly Val Asp Ala Ser Ala Val Glu Val Thr Thr Ser
-20 -15 -10

Asp Gln Pro Glu Leu Tyr Ala Asp Ile Ile Gly Gly Leu Ala Tyr Tyr
-5 -1 1 5 10

Met Gly Gly Arg Cys Ser Val Gly Phe Ala Ala Thr Asn Ala Ser Gly
15 20 25

Gln Pro Gly Phe Val Thr Ala Gly His Cys Gly Thr Val Gly Thr Pro
30 35 40

Val Ser Ile Gly Asn Gly Lys Gly Val Phe Glu Arg Ser Ile Phe Pro
45 50 55

Gly Asn Asp Ser Ala Phe Val Arg Gly Thr Ser Asn Phe Thr Leu Thr
60 65 70

Asn Leu Val Ser Arg Tyr Asn Ser Gly Gly Tyr Ala Thr Val Ala Gly
75 80 85 90

His Asn Gln Ala Pro Ile Gly Ser Ala Val Cys Arg Ser Gly Ser Thr
95 100 105

Thr Gly Trp His Cys Gly Thr Ile Gln Ala Arg Asn Gln Thr Val Arg
110 115 120

Tyr Pro Gln Gly Thr Val Tyr Ser Leu Thr Arg Thr Thr Val Cys Ala
125 130 135

Glu Pro Gly Asp Ser Gly Gly Ser Tyr Ile Ser Gly Thr Gln Ala Gln
140 145 150

Gly Val Thr Ser Gly Gly Ser Gly Asn Cys Ser Ala Gly Gly Thr Thr
155 160 165 170

Tyr Tyr Gln Glu Val Asn Pro Met Leu Ser Ser Trp Gly Leu Thr Leu
175 180 185

Arg Thr Gln Ser His Val Gln Ser Ala Pro
190 195

<210> 46
<211> 165
<212> PRT

<213> Artificial sequence
<220>

<223> shuffled propeptide G-2.19

<220>

<221> PROPEP
<222> (1)..(165)

<400> 46



Ala Thr Gly Ala Leu Pro Gln Ser Pro Thr Pro Glu Ala Asp Ala Val
1 5 10 15

Ser Met Gln Glu Ala Leu Gln Arg Asp Leu Asp Leu Thr Ser Ala Glu
20 25 30

Ala Glu Glu Leu Leu Ala Ala Gln Asp Thr Ala Phe Glu Val Asp Glu
35 40 45

Ala Ala Ala Ala Ala Ala Gly Asp Ala Tyr Gly Gly Ser Val Phe Asp
50 55 60

Thr Glu Ser Leu Thr Leu Thr Val Leu Val Thr Asp Ala Ser Ala Val
65 70 75 80

Glu Ala Val Glu Ala Ala Gly Ala Glu Ala Lys Val Val Ser His Gly
85 90 95

Met Glu Gly Leu Glu Glu Ile Val Ala Asp Leu Asn Ala Ala Asp Ala
100 105 110

Gln Pro Gly Val Val Gly Trp Tyr Pro Asp Ile His Ser Asp Thr Val
115 120 125

Val Leu Glu Val Leu Glu Gly Ser Gly Ala Asp Val Asp Ser Leu Leu
130 135 140

Ala Asp Ala Gly Val Asp Ala Ser Ala Val Glu Val Thr Thr Ser Asp
145 150 155 160

Gln Pro Glu Leu Tyr
165

<210> 47
<211> 166
<212> PRT

<213> Artificial sequence
<220>

<223> shuffled propeptide G-2.73
<220>

<221> PROPEP
<222> (1)..(166)

<400> 47

Ala Thr Gly Ala Leu Pro Gln Ser Pro Thr Pro Glu Ala Asp Ala Val
1 5 10 15

Ser Met Gln Glu Ala Leu Gln Arg Asp Leu Asp Leu Ser Ser Ala Glu
20 25 30

Ala Glu Glu Leu Leu Ala Ala Gln Asp Thr Ala Phe Glu Val Asp Glu
35 40 45

Ala Ala Ala Gly Ala Ala Gly Asp Ala Tyr Gly Gly Ser Val Phe Asp
50 55 60

Thr Glu Thr Leu Glu Leu Thr Val Leu Val Thr Asp Ala Ser Ala Val
65 70 75 80

Glu Ala Val Glu Ala Ala Gly Ala Glu Ala Lys Val Val Ser His Gly
85 90 95

Met Glu Gly Leu Glu Glu Ile Val Ala Asp Leu Asn Ala Ala Asp Ala
100 105 110

Gln Pro Gly Val Val Gly Trp Tyr Pro Asp Ile His Ser Asp Thr Val
115 120 125

Val Val Glu Val Leu Glu Gly Ser Gly Ala Asp Val Asp Ser Leu Leu
130 135 140

Ala Asp Ala Gly Val Asp Thr Ala Asp Val Lys Val Glu Ser Thr Thr
145 150 155 160

Glu Gln Pro Glu Leu Tyr
165

<210> 48
<211> 166
<212> PRT

<213> Artificial sequence
<220>

<223> shuffled propeptide G-1.43

<220>

<221> PROPEP
<222> (1)..(166)

<400> 48



Ala Thr Gly Ala Leu Pro Gln Ser Pro Thr Pro Glu Ala Asp Ala Val
1 5 10 15

Ser Met Gln Glu Ala Leu Gln Arg Asp Leu Gly Leu Ser Ser Ser Gln
20 25 30

Ala Glu Glu Leu Leu Asp Ala Gln Ala Glu Ser Phe Glu Ile Asp Glu
35 40 45

Ala Ala Ala Ala Ala Ala Gly Asp Ala Tyr Gly Gly Ser Ile Phe Asp
50 55 60

Thr Asp Ser Leu Thr Leu Thr Val Leu Val Thr Asp Ala Ser Ala Val
65 70 75 80

Glu Ala Val Glu Ala Ala Gly Ala Glu Ala Lys Val Val Ser His Gly
85 90 95

Met Glu Gly Leu Glu Glu Ile Val Ala Asp Leu Asn Ala Ala Asp Ala
100 105 110

Gln Pro Gly Val Val Gly Trp Tyr Pro Asp Ile His Ser Asp Thr Val
115 120 125

Val Leu Glu Val Leu Glu Gly Ser Gly Ala Asp Val Asp Ser Leu Leu
130 135 140

Ala Asp Ala Gly Val Asp Thr Ala Asp Val Lys Val Glu Ser Thr Thr
145 150 155 160

Glu Gln Pro Glu Leu Tyr
165



<210> 49 

<211> 166

<212> PRT 

<213> Artificial sequence
<220> 

<223> Shuffled propeptide 6-2 J 



<220>

<221> PROPEP
<222> (1)..(166)

<400> 49

Ala Thr Gly Ala Leu Pro Gln Ser Pro Thr Pro Glu Ala Asp Ala Val
1 5 10 15

Ser Met Gln Glu Ala Leu Gln Arg Asp Leu Asp Leu Thr Ser Ala Glu
20 25 30

Ala Glu Glu Leu Leu Ala Ala Gln Asp Thr Ala Phe Glu Val Asp Glu
35 40 45

Ala Ala Ala Ala Ala Ala Gly Asp Ala Tyr Gly Gly Ser Ile Phe Asp
50 55 60

Thr Glu Thr Leu Glu Leu Thr Val Leu Val Thr Asp Ser Ser Ser Val
65 70 75 80

Glu Ala Val Glu Ala Ala Gly Ala Glu Ala Lys Val Val Ser His Gly
85 90 95

Met Glu Gly Leu Glu Glu Ile Val Ala Asp Leu Asn Ala Ala Asp Ala
100 105 110

Gln Pro Gly Val Val Gly Trp Tyr Pro Asp Ile His Ser Asp Thr Val
115 120 125

Val Leu Glu Val Leu Glu Gly Ser Gly Ala Asp Val Asp Ser Leu Leu
130 135 140

Ala Gly Ala Gly Val Asp Thr Ala Asp Val Lys Val Gly Ser Thr Thr
145 150 155 160

Glu Gln Pro Glu Leu Tyr
165



<210> 50
<211> 165
<212> PRT

<213> Artificial sequence

<220>
<223> shuffled propeptide G-2.5

<220>

<221> PROPEP
<222> (1)..(165)

<400> 50

Ala Thr Gly Ala Leu Pro Gln Ser Pro Thr Pro Gly Ala Asp Ala Val
1 5 10 15

Ser Met Gln Glu Ala Leu Gln Arg Asp Leu Gly Leu Thr Pro Leu Glu
20 25 30

Ala Glu Gly Leu Leu Ala Ala Gln Asp Thr Ala Phe Glu Val Asp Glu
35 40 45

Ala Ala Ala Glu Ala Ala Gly Asp Ala Tyr Gly Gly Ser Val Phe Asp
50 55 60

Thr Glu Thr Leu Gly Leu Thr Val Leu Val Thr Asp Ala Ser Ala Val
65 70 75 80

Gly Ala Val Glu Ala Ala Gly Ala Glu Ala Lys Val Val Ser His Gly
85 90 95

Met Glu Gly Leu Glu Gly Ile Val Ala Asp Leu Asn Ala Ala Asp Ala
100 105 110

Gln Pro Gly Val Val Gly Trp Tyr Pro Asp Ile His Ser Asp Thr Val
115 120 125

Val Leu Glu Val Leu Glu Gly Ser Gly Ala Asp Val Asp Ser Leu Leu
130 135 140

Ala Asp Ala Gly Val Asp Ala Ser Ala Val Glu Val Thr Pro Ala Ala
145 150 155 160

Arg Pro Glu Leu Tyr
165

<210> 51

<211> 166
<212> PRT

<213> Artificial sequence

<220>
<223> Shuffled propeptide G-23

<400> 51


Ala Thr Gly Ala Leu Pro Gln Ser Pro Thr Pro Asp Gly Ala Glu Ala
1 5 10 15

Thr Thr Met Val Glu Ala Leu Gln Arg Asp Leu Gly Leu Thr Pro Ala
20 25 30

Glu Ala Glu Glu Leu Leu Ala Ala Gln Asp Thr Ala Phe Glu Val Asp
35 40 45

Glu Ala Ala Ala Ala Ala Ala Gly Asp Ala Tyr Gly Gly Ser Ile Phe
50 55 60

Asp Thr Asp Ser Leu Thr Leu Thr Val Leu Val Thr Asp Ala Ala Ala
65 70 75 80

Val Glu Ala Val Glu Ala Ala Gly Ala Glu Ala Lys Val Val Ser His
85 90 95

Gly Met Glu Gly Leu Glu Glu Ile Val Ala Asp Leu Asn Ala Ala Asp
100 105 110

Ala Val Pro Gly Val Val Gly Trp Tyr Pro Asp Val Ala Gly Asp Thr
115 120 125

Val Val Leu Glu Val Leu Glu Gly Ser Gly Ala Asp Val Tyr Ser Leu
130 135 140

Leu Ala Asp Ala Gly Val Asp Ala Ser Ala Val Glu Val Thr Pro Ala
145 150 155 160

Ala Gln Pro Glu Leu Tyr
165



<210> 52
<211> 166
<212> PRT

<213> Artificial sequence

<220>
<223> shuffled propeptide G-1.4

<220>

<221> PROPEP
<222> (1)..(166)

<400> 52

Ala Thr Gly Ala Leu Pro Gln Ser Pro Thr Pro Glu Ala Asp Ala Val
1 5 10 15

Ser Met Gln Glu Ala Leu Gln Arg Asp Leu Gly Leu Ser Ser Ser Gln
20 25 30

Ala Glu Glu Leu Leu Asp Ala Gln Ala Glu Ser Phe Glu Ile Asp Glu
35 40 45

Ala Ala Ala Ala Ala Ala Ala Asp Ser Tyr Gly Gly Ser Ile Phe Asp
50 55 60

Thr Asp Ser Leu Thr Leu Thr Val Leu Val Thr Asp Ala Ser Ala Val
65 70 75 80

Glu Ala Val Glu Ala Ala Gly Ala Glu Ala Lys Val Val Ser His Gly
85 90 95

Met Glu Gly Leu Glu Glu Ile Val Ala Asp Leu Asn Ala Ala Asp Ala
100 105 110

Gln Pro Gly Val Val Gly Trp Tyr Pro Asp Ile His Ser Asp Thr Val
115 120 125

Val Leu Glu Val Leu Glu Gly Ser Gly Ala Asp Val Asp Ser Leu Leu
130 135 140

Ala Asp Ala Gly Val Asp Thr Ala Asp Val Lys Val Glu Ser Thr Thr
145 150 155 160

Glu Gln Pro Glu Leu Tyr
165



<210> 53

<211> 166

<212> PRT

<213> Artificial sequence
<220>

<223> Shuffled propeptide G-1.2

<220>

<221> PROPEP
<222> (1)..(166)

<400> 53

Ala Thr Gly Ala Leu Pro Gln Ser Pro Thr Pro Glu Ala Asp Ala Val
1 5 10 15

Ser Met Gln Glu Ala Leu Gln Arg Asp Leu Asp Leu Thr Ser Ala Gly
20 25 30

Ala Glu Glu Leu Leu Ala Ala Gln Asp Thr Ala Phe Gly Val Asp Glu
35 40 45

Ala Ala Ala Ala Ala Ala Gly Asp Ala Tyr Gly Gly Ser Ile Phe Asp
50 55 60

Thr Glu Thr Leu Glu Leu Thr Val Leu Val Thr Asp Ser Ser Ser Val
65 70 75 80

Glu Ala Val Glu Ala Ala Gly Ala Gly Ala Lys Val Val Ser His Gly
85 90 95

Met Glu Gly Leu Glu Glu Ile Val Ala Asp Leu Asn Ala Ala Asp Ala
100 105 110

Gln Pro Gly Val Val Gly Trp Tyr Pro Asp Ile His Ser Asp Thr Val
115 120 125

Val Leu Glu Val Leu Glu Gly Ser Gly Ala Asp Val Asp Ser Leu Leu
130 135 140

Ala Gly Ala Gly Val Asp Thr Ala Asp Val Lys Val Glu Ser Thr Thr
145 150 155 160

Glu Gln Pro Glu Leu Tyr
165
18 July 2002 (2002-07-18)
page 8, line 23 - page 9, line 20
page 16, lines 19-28
claims 1,18,30,37

WO 01/58276 A (HOFFMANN LA ROCHE; OESTERGAARD PETER RAHBEK (DK); SJOEHOLM CARSTEN (D) 16 August 2001 (2001-08-16)
page 4, lines 13-34; claims 1-12; sequence